01 The necessity of JOISS data quality management
Ocean observation technology is developing rapidly, and the ways of acquiring ocean data are diversifying with it. When a new observation method is introduced, the form, production period, and format of the data often differ so much from earlier data that existing standards and quality control techniques are difficult to apply. Even when the same method is used, storage methods and formats keep changing. Sustained attention to developing quality control techniques is therefore needed so that costly marine data can be shared.
JOISS does not produce data directly. However, as a distributor that collects data produced by research projects and provides it directly to users in order to promote its use, JOISS is responsible for managing data quality. To make quality control more standardized, it helps to draw on established best practices, so the JOISS quality management policies and procedures refer to a number of internationally recommended quality management standards.
In addition, the processes used throughout data quality management must be documented. Because newly collected data can be of many different types, this documentation needs regular updates. Keeping it current is an important step in increasing the reliability of the data for researchers who will use the data provided by JOISS.
02 Data Quality Control flags
A quality flag is assigned to each data value. The flag describes the quality of the data; the data value itself is not changed. The meanings of the JOISS quality flags are shown in the table below.
flag | Definition | Detail |
---|---|---|
0 | No quality control | Data without QC, or values not subject to quality control |
1 | Good value | Value whose quality has been verified by QC |
2 | Probably good value | Value that can plausibly occur, but that has not undergone sufficient QC |
3 | Probably bad value | Value that may be invalid |
4 | Bad value | Obviously wrong value |
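As a minimal sketch, the five flags in the table can be represented as an enumeration that travels alongside each value. The names `QCFlag` and `observation` are hypothetical and chosen only for illustration; they are not part of any JOISS software.

```python
from enum import IntEnum


class QCFlag(IntEnum):
    """Quality flags as listed in the table above (names are illustrative)."""
    NO_QC = 0          # no quality control performed
    GOOD = 1           # good value
    PROBABLY_GOOD = 2  # probably good value
    PROBABLY_BAD = 3   # probably bad value
    BAD = 4            # bad value


# The flag is stored next to the value; the value itself is never modified.
observation = {"value": 12.7, "flag": QCFlag.PROBABLY_GOOD}
```

Storing the flag in a separate field keeps the original measurement intact, which matches the rule that flagging does not change the data value.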
03 Data quality management
04 Data quality management items
The checks carried out by JOISS are as follows.
No. | Check name | Note |
---|---|---|
1 | Date and time check (Impossible date test) | Mandatory |
2 | Latitude and longitude check (Impossible location test) | Mandatory |
3 | Position check (Position on land test) | Mandatory |
4 | Global range check (Global range test) | Recommended |
5 | Regional range check (Regional range test) | Recommended |
6 | Spike check (Spike test) | Recommended |
7 | Gradient check (Gradient test) | Recommended |
8 | Taxonomic check (Taxonomic match) | Suggested |
Date and time check
- Year: must have 4 digits
- Month: 1 to 12 (January to December)
- Day: must exist in the given month
- Hour: 0 to 23
- Minute: 0 to 59
- The date and time must fall within the research project period
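The rules in this list could be sketched in Python as follows. The function name `check_datetime` and its parameters are assumptions made for illustration. The sketch relies on the standard `datetime` constructor, which raises `ValueError` for an impossible month, day, hour, or minute.

```python
from datetime import datetime


def check_datetime(year, month, day, hour, minute, project_start, project_end):
    """Return True if the fields form a valid timestamp inside the project period."""
    # Year must have four digits.
    if not (1000 <= year <= 9999):
        return False
    try:
        # Rejects month outside 1-12, a day not in the month,
        # hour outside 0-23, and minute outside 0-59.
        timestamp = datetime(year, month, day, hour, minute)
    except ValueError:
        return False
    # The timestamp must fall within the research project period.
    return project_start <= timestamp <= project_end


start = datetime(2020, 1, 1)
end = datetime(2020, 12, 31, 23, 59)
leap_day_ok = check_datetime(2020, 2, 29, 23, 59, start, end)
bad_hour = check_datetime(2020, 6, 15, 24, 0, start, end)
```

Here `leap_day_ok` is true because 29 February exists in 2020, while `bad_hour` is false because hour 24 is out of range.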
Latitude and longitude check
- Latitude must be within the range -90 to 90
- Longitude must be within the range -180 to 180
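The two range conditions can be expressed directly, as in this minimal sketch. The function name `check_lat_lon` is hypothetical, and coordinates are assumed to be in decimal degrees.

```python
def check_lat_lon(lat, lon):
    """Return True if latitude and longitude are inside their valid ranges."""
    # Comparisons with NaN are always False, so NaN coordinates are rejected too.
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0


valid_point = check_lat_lon(35.1, 129.0)
invalid_point = check_lat_lon(95.0, 129.0)
```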
Position check
- Check that the observation station is located in the ocean
The date and time, latitude and longitude, and position checks above are mandatory. If any of them finds an invalid value, the data cannot be entered into the JOISS database.
In that case the original data provider is notified, and the data is entered into the database only after it has been checked and corrected.
After that, the following recommended and suggested checks are applied as appropriate to the data.
Global range check
- Tests whether the observed parameter value is within the range that can be observed in the ocean. The GTSPP Real-Time Quality Control Manual, Revised Edition, 2010 (Version 1.0) is used as the reference for globally impossible parameter values, and unit limits and the measurement limits of commercially available instruments for observation, experiment, and data processing are also taken into account.
Regional range check
- Tests whether the observed parameter value is within the range that can be observed in a particular sea area. The World Ocean Database 2013 User's Manual is used as the reference for parameter values that are impossible in a specific region, and separate ranges are applied to the West Sea, South Sea, and East Sea.
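Both the global and the regional check come down to looking up a lower and an upper limit for a parameter and comparing the value against them. The sketch below assumes a hypothetical function `range_check` and returns the result codes P, E, and N described in section 05. The numeric limits are placeholders chosen only for illustration; the real limits come from the referenced manuals and are not reproduced here.

```python
def range_check(parameter, value, limits):
    """Return 'P' (pass), 'E' (fail), or 'N' (no limits known, not performed)."""
    if parameter not in limits:
        return "N"
    low, high = limits[parameter]
    return "P" if low <= value <= high else "E"


# Placeholder limits for illustration only; consult the referenced manuals.
global_limits = {"temperature": (-2.5, 40.0)}
regional_limits = {
    "East Sea": {"temperature": (-2.0, 32.0)},
}

global_result = range_check("temperature", 33.0, global_limits)
regional_result = range_check("temperature", 33.0, regional_limits["East Sea"])
unknown_result = range_check("salinity", 34.0, global_limits)
```

With these placeholder limits the same value passes the global check but fails the regional one, which is the situation the narrower regional ranges are meant to catch.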
Spike check
- Detects values that differ markedly from adjacent values, that is, values that spike (both magnitude and slope are checked).
test value = | V2 - (V3 + V1) / 2 | - | (V3 - V1) / 2 |
where V2 is the measured value, and V1 and V3 are the adjacent previous and next values.
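The spike formula can be evaluated as in the sketch below. The function names and the `threshold` parameter are assumptions; the text gives no threshold, so the caller has to supply one suited to the parameter.

```python
def spike_test_value(v1, v2, v3):
    """test value = |V2 - (V3 + V1) / 2| - |(V3 - V1) / 2|"""
    return abs(v2 - (v3 + v1) / 2) - abs((v3 - v1) / 2)


def is_spike(v1, v2, v3, threshold):
    """Flag V2 as a spike when the test value exceeds the given threshold."""
    return spike_test_value(v1, v2, v3) > threshold


isolated_jump = spike_test_value(10.0, 15.0, 10.0)  # 5.0
steady_ramp = spike_test_value(10.0, 11.0, 12.0)    # -1.0
```

The second term subtracts half the difference between the neighbours, so a value lying on a steady ramp yields a small or negative test value, while an isolated jump yields a large one.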
Gradient check
- Detects cases where the change between adjacent measurements is too steep.
test value = | V2 - (V3 + V1) / 2 |
where V2 is the measured value, and V1 and V3 are the adjacent previous and next values.
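As a sketch, the gradient test value can be computed for every interior point of a series by sliding a three-value window along it. The function names and the sample `profile` are hypothetical.

```python
def gradient_test_value(v1, v2, v3):
    """test value = |V2 - (V3 + V1) / 2|"""
    return abs(v2 - (v3 + v1) / 2)


def gradient_values(profile):
    """Test values for each interior point; the two end points have no pair of neighbours."""
    return [
        gradient_test_value(a, b, c)
        for a, b, c in zip(profile, profile[1:], profile[2:])
    ]


profile = [10.0, 10.0, 14.0, 10.0, 10.0]
values = gradient_values(profile)  # [2.0, 4.0, 2.0]
```

Unlike the spike formula, this one has no second term, so the points next to a jump also produce raised test values.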
05 Applying the data quality control flag (QC flag)
Each check produces one of three result codes: not performed, pass, or fail.
code | Explanation | Detail |
---|---|---|
N | Not performed (n/a) | The check was not performed |
P | Pass | The check was passed |
E | Fail | The check was not passed |
The final QC flag is determined by combining the results of the individual checks. As an example, the table below shows the QC flag for a depth value (10 m) that passed the date and time, latitude and longitude, position, and global range checks, and for which the regional range check was not performed.
value | Date and time check | Latitude and longitude check | Position check | Global range check | Regional range check | QC Flag |
---|---|---|---|---|---|---|
10 m | P | P | P | P | N | 2 (probably good value) |
- Passed the date and time check
- Passed the latitude and longitude check
- Passed the position check
- Passed the global range check
- Regional range check not performed
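The text gives only this one worked example of how check results combine, so the rule sketched below is a guess that reproduces it, not the documented JOISS rule. The assumptions are: data failing a mandatory check is rejected rather than flagged; a failed optional check yields flag 4; an optional check that was not performed yields flag 2; and flag 1 requires every optional check to have passed. The function name, the check keys, and the `flag_on_fail` parameter are all hypothetical.

```python
MANDATORY = ("date_time", "lat_lon", "position")


def final_qc_flag(results, flag_on_fail=4):
    """Combine per-check codes ('P', 'N', 'E') into one flag (hypothetical rule)."""
    # Mandatory checks must all pass before the data can enter the database.
    if any(results.get(name) != "P" for name in MANDATORY):
        raise ValueError("mandatory check not passed; return data to the provider")
    optional = [code for name, code in results.items() if name not in MANDATORY]
    if "E" in optional:
        return flag_on_fail  # assumption: the source does not say which flag applies
    if optional and all(code == "P" for code in optional):
        return 1             # every optional check was performed and passed
    return 2                 # passed so far, but QC is incomplete


example = {
    "date_time": "P",
    "lat_lon": "P",
    "position": "P",
    "global_range": "P",
    "regional_range": "N",
}
flag = final_qc_flag(example)  # 2, matching the example table
```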