hobo_qaqc Module¶
-
class
hobo_qaqc.
HOBOdata
[source]¶ Load and process data from HOBO loggers produced by the ONSET company.
Handles csv files exported from the HoboWare program. The native format for HOBO loggers is a .hobo file. This proprietary binary file is not handled here and must be converted to a csv.
This class syncs timesteps, checks time zones and units, and converts them where needed.
-
export_to_GCE_csv
(csvname, units, tz)[source]¶ Export the HOBO data to a GCE friendly csv file
Parameters: - csvname – str. Filepath to output csv file
- units – str. Units of output data. Example: ‘SI’.
- tz – float. GMT time zone of output data series. Example: -8.
-
format_QAQC_data
(units='SI', tz=-8, tstep='5min')[source]¶ Reformat the data using basic QAQC for SI or US units and a consistent time zone, regardless of daylight saving time.
Parameters: - units – str. keyword argument. The desired system of units. Default is ‘SI’.
- tz – float. keyword argument. The desired time zone as an offset from Greenwich Mean Time. Default is -8 (PST).
- tstep – str. keyword argument. Interval to round time stamps to. Default ‘5min’.
Note
tstep is input to the function
HOBOdata.format_sync_timestep()
. Valid types are listed there.
-
format_intensity
(col='Intensity', unit='Lux')[source]¶ Format light intensity records in desired units
Parameters: - col – keyword argument. str. Name of column containing light intensity data. Defaults to ‘Intensity’.
- unit – keyword argument. str defining desired units. Default is ‘Lux’ (SI)
-
format_sync_timestep
(n_min='5min')[source]¶ Sync timestamps to a defined measurement interval. Timestamps are increased to the next defined interval.
Parameters: n_min – str. keyword argument. Interval to round time stamps to. Default ‘5min’.
Note
This uses the function ceil to round up to the next interval. The interval provided must match a known type and contain both a number and a letter such as ‘1D’ to round up to the next whole day.
See documentation for valid types [1]
Warning
This will change the index and timestamp of every record.
[1] : https://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
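To make the rounding rule concrete, here is a minimal sketch of "round up to the next interval" written with plain `datetime` arithmetic. It is not the module's implementation, which the note says relies on `ceil` (pandas offers this as `DatetimeIndex.ceil` with an offset alias such as '5min'). The helper name `ceil_to_interval` and its `minutes` parameter are hypothetical.

```python
from datetime import datetime, timedelta


def ceil_to_interval(ts, minutes=5):
    """Round a datetime up to the next multiple of `minutes` (hypothetical helper)."""
    step = timedelta(minutes=minutes)
    # Measure elapsed time from midnight so intervals align with the clock.
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    remainder = (ts - midnight) % step
    if remainder == timedelta(0):
        return ts  # already on an interval boundary, left unchanged
    return ts + (step - remainder)


stamps = [datetime(2014, 11, 17, 9, 7), datetime(2014, 11, 17, 9, 5)]
synced = [ceil_to_interval(t) for t in stamps]
for before, after in zip(stamps, synced):
    print(before.time(), "->", after.time())
```

A reading logged at 09:07 moves to 09:10, while one already at 09:05 stays put, which is why every off-interval record gets a new timestamp.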
-
format_temp
(col='Temp', unit='C')[source]¶ Format temperature records to desired units
Parameters: - col – keyword argument. str. Name of column containing temperature data. Defaults to ‘Temp’
- unit – keyword argument. str defining desired unit. Default is ‘C’
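The conversion behind a temperature reformat is the standard Fahrenheit/Celsius relation. The sketch below shows it for a list of readings; the function name `convert_temp` and the single-letter unit codes 'C' and 'F' are assumptions for illustration, not the module's API.

```python
def convert_temp(values, current_unit, target_unit='C'):
    """Convert temperature readings between 'C' and 'F' (hypothetical helper)."""
    if current_unit == target_unit:
        return list(values)  # nothing to convert
    if current_unit == 'F' and target_unit == 'C':
        return [(v - 32.0) * 5.0 / 9.0 for v in values]
    if current_unit == 'C' and target_unit == 'F':
        return [v * 9.0 / 5.0 + 32.0 for v in values]
    raise ValueError(f"unsupported conversion: {current_unit} to {target_unit}")


celsius = convert_temp([32.0, 212.0], 'F', 'C')
print(celsius)
```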
-
format_timezone
(tz=-8)[source]¶ Check that timezone is correct, and if not, adjust the time zone.
Parameters: tz – a timezone as number of hours offset from Greenwich Mean Time
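As an illustration of what adjusting a fixed-offset time zone involves, the sketch below shifts naive timestamps from the offset they were recorded in to a target offset. The helper `shift_timezone` and its parameter names are hypothetical; the module's own method may work differently.

```python
from datetime import datetime, timedelta


def shift_timezone(stamps, recorded_tz, target_tz=-8):
    """Shift naive timestamps from one GMT offset (hours) to another."""
    # Moving from GMT-7 to GMT-8 subtracts one hour from the wall-clock time.
    delta = timedelta(hours=target_tz - recorded_tz)
    return [t + delta for t in stamps]


recorded = [datetime(2014, 7, 1, 12, 0)]  # logged at GMT-7 (PDT)
standard = shift_timezone(recorded, recorded_tz=-7, target_tz=-8)
print(standard[0])
```

Because both offsets are fixed, the result does not jump when daylight saving time starts or ends.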
-
get_csv_GMT_offset
(header, lineno=-1)[source]¶ Get timezone as an offset from Greenwich Mean Time from the header file
Parameters: - lineno – keyword argument. index of header array. Function operates on specified index. Default -1
- header – array of header lines where each line is a single string.
Returns: string of timezone offset from GMT
Example:
String for PST '-08:00'
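One way to pull such an offset string out of a header line is a regular expression on the 'GMT±HH:MM' label. This is a sketch under that assumption; `find_gmt_offset` is a hypothetical name and the sample header is shortened from the one shown for `get_csv_col`.

```python
import re


def find_gmt_offset(header, lineno=-1):
    """Return the '+HH:MM' or '-HH:MM' text that follows 'GMT' in a header line."""
    match = re.search(r'GMT\s*([+-]\d{2}:\d{2})', header[lineno])
    if match is None:
        raise ValueError('no GMT offset found in header line')
    return match.group(1)


header = ['"#","Date","Time, GMT-08:00","Temp, °C (LGR S/N: 920980, SEN S/N: 920980)"\n']
offset = find_gmt_offset(header)
print(offset)
```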
-
get_csv_col
(header, sep, lineno=-1)[source]¶ Extract column names from csv format.
From multiple header lines, this extracts a single line, and strips extra info, leaving only column names. File delimiter is used to split header into columns, and ‘,’ is used to split info within a column.
Example:
Single string header: ['"#","Date","Time, GMT-08:00","Temp, °C (LGR S/N: 920980, SEN S/N: 920980)","Intensity, Lux (LGR S/N: 920980, SEN S/N: 920980)"\n'] becomes a list of column strings: ['#', 'Date', 'Time', 'Temp', 'Intensity']
Parameters: - header – array of header lines where each line is a single string.
- lineno – keyword argument. index of header array. Function operates on specified index. Default -1
Returns: array of column names.
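The two-stage split described above can be sketched with the standard `csv` module: first split the header line on the file delimiter while respecting quotes, then keep only the text before the first ',' in each field. Using `csv.reader` is an assumption made for this illustration, and `column_names` is a hypothetical name.

```python
import csv


def column_names(header, sep=',', lineno=-1):
    """Reduce one quoted header line to bare column names."""
    line = header[lineno].strip()
    # csv.reader honours the quotes, so commas inside a label do not split it.
    fields = next(csv.reader([line], delimiter=sep))
    # Within each field, everything after the first ',' is unit or serial info.
    return [field.split(',')[0].strip() for field in fields]


header = ['"#","Date","Time, GMT-08:00","Temp, °C (LGR S/N: 920980, SEN S/N: 920980)",'
          '"Intensity, Lux (LGR S/N: 920980, SEN S/N: 920980)"\n']
cols = column_names(header)
print(cols)
```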
-
get_csv_intensity_unit
(header, lineno=-1)[source]¶ Get unit for sunlight intensity
Parameters: - header – array of header lines where each line is a single string
- lineno – keyword argument. index of header array. Function operates on specified index. Default -1
Returns: str defining units for sunlight intensity
-
get_csv_sn
(header, lineno=-1)[source]¶ Get the serial number from the header
Parameters: - header – array of header lines where each line is a single string.
- lineno – keyword argument. index of header array. Function operates on specified index. Default -1
Returns: str containing serial number
-
get_csv_temp_unit
(header, lineno=-1)[source]¶ Get unit for temperature records
Parameters: - header – array of header lines where each line is a single string.
- lineno – keyword argument. index of header array. Function operates on specified index. Default -1
Returns: str with single letter defining units for temperature.
-
get_delimiter
(header, lineno=-1)[source]¶ Find the delimiter used in the csv file.
As of 3/9/21, the only possible delimiters when exporting from HOBOware are , ; and , . This method tests which one is used and returns it.
Parameters: - header – array of header lines where each line is a single string.
- lineno – keyword argument. index of header array. Function operates on specified index. Default -1
Returns: str containing delimiter
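A simple way to test which delimiter a header line uses is to count each candidate outside the quoted fields, since column labels themselves contain commas. The sketch below does that; the candidate set (comma, semicolon, tab) and the name `guess_delimiter` are assumptions for illustration and may not match the module's own list.

```python
import re

CANDIDATES = (',', ';', '\t')  # assumed candidate set, not taken from the module


def guess_delimiter(header, lineno=-1):
    """Pick the candidate separator that occurs most often outside quotes."""
    # Drop quoted fields so separators inside column labels are not counted.
    bare = re.sub(r'"[^"]*"', '', header[lineno])
    counts = {sep: bare.count(sep) for sep in CANDIDATES}
    best = max(counts, key=counts.get)
    if counts[best] == 0:
        raise ValueError('no known delimiter found')
    return best


semi = guess_delimiter(['"#";"Date";"Time, GMT-08:00";"Temp, °C"\n'])
comma = guess_delimiter(['"#","Date","Time, GMT-08:00","Temp, °C"\n'])
print(repr(semi), repr(comma))
```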
-
get_header_nlines
(file_name)[source]¶ Estimate how many header lines exist in a file.
Parameters: file_name – str containing file path
Returns: int that is index of last header line
Warning
This is a simplistic filter that searches for the first row where there are < 8 letters. 8 letters allow for 12 hour time format (AM/PM) plus ‘Logged’, while separating number data from text headers.
Files whose headers consist only of numbers and special characters, or whose data rows contain text, will break this method.
Example:
'Plot Title: RS12'
'#','Date Time, GMT-07:00','Temp, °C','Intensity, lum/ft²','Coupler Attached','Stopped','End Of File'
1,11/17/2014 11:10:00 AM,3.472,16.0,Logged,,
returns 2
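The letter-count filter can be sketched as follows. This is an illustration, not the module's code: it takes a list of lines instead of a file path, and it treats a row with at most 8 letters as data, because the example data row contains exactly 8 ('AM' plus 'Logged'). Both the name `count_header_lines` and that threshold reading are assumptions.

```python
def count_header_lines(lines, max_letters=8):
    """Return the index of the first row that looks like data."""
    for index, line in enumerate(lines):
        letters = sum(ch.isalpha() for ch in line)
        if letters <= max_letters:
            return index  # rows before this index are treated as header
    return len(lines)  # no data-like row found


lines = [
    "'Plot Title: RS12'",
    "'#','Date Time, GMT-07:00','Temp, °C','Intensity, lum/ft²',"
    "'Coupler Attached','Stopped','End Of File'",
    "1,11/17/2014 11:10:00 AM,3.472,16.0,Logged,,",
]
n_header = count_header_lines(lines)
print(n_header)
```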
-
get_timestamp_col
(col)[source]¶ Time stamps can be exported by HOBO into either 1 or 2 columns
Parameters: col – an array of column names
Returns: list of index locations
Returns: list of column name(s) that make the timestamp
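For illustration, locating the timestamp column(s) could look like the sketch below. The column labels 'Date', 'Time' and 'Date Time' are taken from the example headers on this page; treating them as the only possibilities is an assumption, as is the name `timestamp_columns`.

```python
TIMESTAMP_NAMES = ('Date', 'Time', 'Date Time')  # assumed labels from the examples


def timestamp_columns(col):
    """Return (index locations, names) of the columns that form the timestamp."""
    idx = [i for i, name in enumerate(col) if name in TIMESTAMP_NAMES]
    names = [col[i] for i in idx]
    return idx, names


two_cols = timestamp_columns(['#', 'Date', 'Time', 'Temp', 'Intensity'])
one_col = timestamp_columns(['#', 'Date Time', 'Temp', 'Intensity'])
print(two_cols, one_col)
```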
-
intensity_lumft2_to_lux
(intensity)[source]¶ Convert light intensity records from lumen ft-2 into Lux
Parameters: intensity – an intensity value or list of intensity values in lumen ft-2
Returns: an intensity or list of intensity values in Lux
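The conversion follows from unit definitions: one lux is one lumen per square metre, and one foot is 0.3048 m, so 1 lumen ft-2 is about 10.764 Lux. A minimal sketch that accepts either a single value or a list is given below; the name `lumft2_to_lux` is hypothetical.

```python
LUX_PER_LUM_FT2 = 1.0 / 0.3048 ** 2  # square feet per square metre, about 10.764


def lumft2_to_lux(intensity):
    """Convert a value or list of values from lumen ft-2 to Lux."""
    if isinstance(intensity, (list, tuple)):
        return [value * LUX_PER_LUM_FT2 for value in intensity]
    return intensity * LUX_PER_LUM_FT2


single = lumft2_to_lux(1.0)
several = lumft2_to_lux([16.0, 0.0])
print(round(single, 3), [round(v, 2) for v in several])
```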
-
is_intensity_lux
()[source]¶ Read units definition from header and return True if units are Lux
Returns: Boolean. True if light intensity is recorded in Lux
-
is_temp_celsius
()[source]¶ Read units definition from header and return true if units are celsius
Returns: Boolean. True if temperature is recorded in celsius.
-
is_timezone_correct
(tz)[source]¶ Check the timezone in which data was recorded against the expected timezone
Parameters: tz – a timezone as number of hours offset from Greenwich Mean Time
Returns: Boolean
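Since the header stores the offset as a string such as '-08:00' while `tz` is numeric, a check like this needs to bring both to the same form. The sketch below converts the string to hours and compares; both helper names are hypothetical and the module may do this differently.

```python
def offset_to_hours(offset):
    """Convert '+HH:MM' or '-HH:MM' to a signed number of hours."""
    sign = -1.0 if offset.startswith('-') else 1.0
    hours, minutes = offset.lstrip('+-').split(':')
    return sign * (int(hours) + int(minutes) / 60.0)


def timezone_matches(recorded_offset, tz):
    """True if the recorded offset string equals the expected offset in hours."""
    return abs(offset_to_hours(recorded_offset) - float(tz)) < 1e-9


print(timezone_matches('-08:00', -8), timezone_matches('-07:00', -8))
```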
-
load_csv_data
(fname)[source]¶ Load csv file output by HOBO pendants into a Pandas DataFrame.
Parameters: fname – str. Filepath of csv data file
-
read_csv_header
(file_name)[source]¶ Read the header lines from the beginning of a file. Reads n_lines, and stores them as headers object.
Parameters: file_name – str. File path of file to be read.
-
reformat_HOBO_csv
(infname, outfname=None, units='SI', tz=-8, tstep='5min')[source]¶ Imports a csv file output by HoboWare software and checks for:
- units
- timezone
- time sync (09:07 vs 09:05)
File is converted to specified settings and exported to a GCE friendly format.
Parameters: - infname – str. Filename to read
- outfname – str. Filename to output. Defaults to same as infname
- units – str. System of units desired. Defaults to SI
- tz – int or float. Timezone as offset from GMT
- tstep – str. Time interval to sync to. Default is ‘5min’. See
HOBOdata.format_sync_timestep()
or [2] for valid formats.
[2] : https://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
-