ctdam.parser.datafiles module

class ctdam.parser.datafiles.DataFile(path_to_file, only_header=False)[source]

Bases: object

The base class for all Sea-Bird data files, which are .cnv, .btl, and .bl. One instance of this class, or of one of its children, represents one data text file. The different pieces of information in such a file are structured into individual lists or dictionaries. The data table is loaded as a NumPy array and can be converted to a pandas DataFrame. Datatype-specific behavior is implemented in the subclasses.

Parameters:
  • path_to_file (Path | str) – The path to the data file.

  • only_header (bool) – Whether to stop reading the file after the metadata header.

read_event_information(regex_string='(?P<c>[a-z]{1,3}\\\\d{1,3})(-|_|\\\\/)?(?P<cn>1|2)?(-|_)(?P<s>\\\\d{1,4})(-|_)(?P<e>\\\\d{1,2})', leading_zeroes=False)[source]

Saves the event metadata of the cast in self.station.

Additionally saves cruise information in self.cruise, if possible. The data sources are the file name and the custom metadata header, in this order.

Parameters:
  • regex_string (str) – The regex to use for event metadata retrieval

  • leading_zeroes (bool) – Whether to save the info with leading zeroes (Default value = False)
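The following is a minimal sketch of how a pattern like the default regex_string can be applied to a file name. It is not the library's implementation. The pattern is written as a raw string with single backslashes, which is an assumed reading of the escaped default shown in the signature. The function name parse_event and the example file name are hypothetical, and the group names c, cn, s and e are kept as-is because their exact meaning is not stated here.

```python
import re

# Assumed reading of the documented default, written as a raw string.
EVENT_PATTERN = re.compile(
    r"(?P<c>[a-z]{1,3}\d{1,3})(-|_|\/)?(?P<cn>1|2)?(-|_)"
    r"(?P<s>\d{1,4})(-|_)(?P<e>\d{1,2})"
)


def parse_event(file_name, leading_zeroes=False):
    """Hypothetical helper: return the named groups found in file_name, or None."""
    match = EVENT_PATTERN.search(file_name)
    if match is None:
        return None
    info = match.groupdict()
    if not leading_zeroes:
        # Strip leading zeroes from the two purely numeric groups,
        # but keep a single "0" if the value was all zeroes.
        for key in ("s", "e"):
            info[key] = info[key].lstrip("0") or "0"
    return info


print(parse_event("emb123_0042_01.cnv"))
print(parse_event("emb123_0042_01.cnv", leading_zeroes=True))
```

With leading_zeroes=False the numeric groups are reduced to "42" and "1"; with leading_zeroes=True they stay "0042" and "01".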

read_file()[source]

Reads and structures all the different information present in the file.

Lists and dictionaries are the data structures of choice. Uses basic prefix checking to distinguish the different kinds of header information.
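Prefix checking of this kind can be sketched as follows. This is an illustration under assumptions, not the method's actual code: it assumes the common Sea-Bird header layout in which "*" lines hold the instrument header, "**" lines hold custom metadata, "#" lines hold processing and sensor information, and "*END*" closes the header. The function name split_by_prefix and the section names are hypothetical.

```python
def split_by_prefix(lines):
    """Hypothetical helper: sort header lines into sections by their prefix."""
    sections = {
        "sea_bird_header": [],
        "custom_metadata": [],
        "processing_info": [],
        "data": [],
    }
    in_header = True
    for line in lines:
        line = line.rstrip("\n")
        if in_header:
            # Order matters: "*END*" and "**" both start with "*".
            if line.startswith("*END*"):
                in_header = False
            elif line.startswith("**"):
                sections["custom_metadata"].append(line[2:].strip())
            elif line.startswith("*"):
                sections["sea_bird_header"].append(line[1:].strip())
            elif line.startswith("#"):
                sections["processing_info"].append(line[1:].strip())
        elif line.strip():
            # After the header, every non-empty line is a row of the data table.
            sections["data"].append(line.split())
    return sections


example = [
    "* Sea-Bird SBE 9 Data File:",
    "** Station = 12",
    "# nquan = 2",
    "*END*",
    "  1.000  10.5",
    "  2.000  10.4",
]
print(split_by_prefix(example))
```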

reading_start_time()[source]

Extracts the cast start time from the metadata header.

Return type:

datetime | None
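A start-time lookup with a datetime | None result could look like the sketch below. It assumes a header line of the form "# start_time = Dec 01 2020 12:34:56 [...]", which is common in .cnv files but is not confirmed by this page. The function name find_start_time is hypothetical. Note that the "%b" directive depends on the active locale, so English month abbreviations are assumed.

```python
import re
from datetime import datetime
from typing import Optional

# Assumed layout of the start time entry in the header.
START_TIME = re.compile(r"start_time\s*=\s*(\w{3} \d{2} \d{4} \d{2}:\d{2}:\d{2})")


def find_start_time(header_lines) -> Optional[datetime]:
    """Hypothetical helper: return the first start time found, or None."""
    for line in header_lines:
        match = START_TIME.search(line)
        if match:
            return datetime.strptime(match.group(1), "%b %d %Y %H:%M:%S")
    return None


header = [
    "# nquan = 2",
    "# start_time = Dec 01 2020 12:34:56 [Instrument's time stamp, header]",
]
print(find_start_time(header))
```

Returning None instead of raising keeps files without a start time entry usable, which matches the documented return type.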

sensor_xml_to_flattened_dict(sensor_data)[source]

Reads the raw XML sensor input and creates a multilevel dictionary, dropping the first two dictionary levels, as each of them holds only a single entry.

Parameters:

sensor_data (str) – The raw xml sensor data.

Return type:

list[dict] | dict
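The idea of dropping the two outer levels can be sketched with the standard library alone. This is not the library's code, and the real method may rely on a different XML parser. The sketch assumes an outer Sensors element that wraps one or more sensor elements and that any "# " line prefixes have already been removed. The names element_to_dict and sensors_from_xml, and the "@" prefix for attributes, are hypothetical choices.

```python
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Hypothetical helper: convert an element into nested dicts and strings."""
    node = {f"@{key}": value for key, value in element.attrib.items()}
    for child in element:
        value = element_to_dict(child)
        if child.tag in node:
            # Repeated tags are collected into a list.
            if not isinstance(node[child.tag], list):
                node[child.tag] = [node[child.tag]]
            node[child.tag].append(value)
        else:
            node[child.tag] = value
    text = (element.text or "").strip()
    if not node:
        return text or None  # leaf element: keep only its text
    if text:
        node["#text"] = text
    return node


def sensors_from_xml(sensor_data):
    """Parse the XML and drop the two outer levels (assumed: Sensors, sensor)."""
    root = ET.fromstring(sensor_data)
    nested = {root.tag: element_to_dict(root)}
    return nested["Sensors"]["sensor"]


xml_text = """
<Sensors count="2">
  <sensor Channel="1">
    <TemperatureSensor SensorID="55"><SerialNumber>1234</SerialNumber></TemperatureSensor>
  </sensor>
  <sensor Channel="2">
    <ConductivitySensor SensorID="3"><SerialNumber>5678</SerialNumber></ConductivitySensor>
  </sensor>
</Sensors>
"""
print(sensors_from_xml(xml_text))
```

With several sensor elements the result is a list of dictionaries; with a single one it is a plain dictionary, which mirrors the documented return type list[dict] | dict.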

structure_metadata(metadata_list)[source]

Creates a dictionary to store custom metadata, of which Sea-Bird allows 12 lines in each file.

Parameters:

metadata_list (list) – A list of the individual lines of metadata found in the file

Return type:

dict
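One way to turn such lines into a dictionary is sketched below. It assumes that each custom metadata line has the form "key = value" and that the leading "**" has already been stripped; neither is confirmed by this page. The function name metadata_lines_to_dict is hypothetical, and skipping lines without "=" is a simplification of this sketch.

```python
def metadata_lines_to_dict(metadata_list):
    """Hypothetical helper: build a dict from 'key = value' metadata lines."""
    metadata = {}
    for line in metadata_list:
        # Split on the first "=" only, so values may contain "=" themselves.
        key, separator, value = line.partition("=")
        if not separator:
            continue  # assumption of this sketch: skip lines without "="
        metadata[key.strip()] = value.strip()
    return metadata


lines = ["Station = 12", "Operator = J. Doe", "Comment = a=b", "free text"]
print(metadata_lines_to_dict(lines))
```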

define_output_path(file_path=None, file_name=None, file_type='.csv')[source]

Creates a Path object holding the desired output path.

Parameters:
  • file_path (Path | str | None) – Directory the file sits in (Default value = self.file_dir)

  • file_name (str | None) – The original file name (Default value = self.file_name)

  • file_type (str) – The file suffix (Default value = “.csv”)

Return type:

Path
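The path construction can be illustrated with pathlib. This sketch is not the method itself: the documented defaults fall back to self.file_dir and self.file_name, whereas the hypothetical function build_output_path below takes both values explicitly. The directory and file names are made up for the example.

```python
from pathlib import Path


def build_output_path(file_path, file_name, file_type=".csv"):
    """Hypothetical helper: join directory and name, then swap the suffix."""
    return Path(file_path).joinpath(file_name).with_suffix(file_type)


print(build_output_path("data/casts", "emb123_0042_01.cnv"))
print(build_output_path("out", "cast", ".txt"))
```

Path.with_suffix replaces only the last suffix, so a name such as "cast.tar.gz" would become "cast.tar.csv" in this sketch.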

to_csv(data, with_header=True, output_file_path=None, output_file_name=None)[source]

Writes a .csv file from the given data.

Parameters:
  • data (DataFrame | ndarray) – The source data to use.

  • with_header (bool) – Whether the header appears in the output (Default value = True)

  • output_file_path (Path | str | None) – File directory (Default value = None)

  • output_file_name (str | None) – Original file name (Default value = None)

selecting_columns(list_of_columns, df)[source]

Alters the dataframe to only hold the given columns.

Parameters:
  • list_of_columns (list | str) – A single column name or a list of column names to keep

  • df (DataFrame) – The DataFrame to reduce to the given columns