hobo_qaqc Module

class hobo_qaqc.HOBOdata[source]

Load and process data from HOBO loggers produced by the ONSET company.

Handles csv files exported from the HOBOware program. The native format for HOBO loggers is a .hobo file. This proprietary binary format is not handled here, so such files must first be converted to csv.

This class syncs timesteps, checks time zones and units, and converts them where needed.

export_to_GCE_csv(csvname, units, tz)[source]

Export the HOBO data to a GCE-friendly csv file.

Parameters:
  • csvname – str. Filepath to output csv file
  • units – str. Units of output data. Example: ‘SI’.
  • tz – float. GMT time zone of output data series. Example: -8.
format_QAQC_data(units='SI', tz=-8, tstep='5min')[source]

Reformat the data using basic QAQC for SI or US units and time zone consistency regardless of daylight savings.

Parameters:
  • units – str. keyword argument. The desired system of units. Default is ‘SI’.
  • tz – float. keyword argument. The desired time zone as an offset from Greenwich Mean Time. Default is -8 (PST).
  • tstep – str. keyword argument. Interval to round time stamps to. Default is ‘5min’.

Note

tstep is input to the function HOBOdata.format_sync_timestep(). Valid types are listed there.

format_intensity(col='Intensity', unit='Lux')[source]

Format light intensity records in desired units

Parameters:
  • col – keyword argument. str. Name of column containing light intensity data. Defaults to ‘Intensity’.
  • unit – keyword argument. str defining desired units. Default is ‘Lux’ (SI)
format_sync_timestep(n_min='5min')[source]

Sync timestamps to a defined measurement interval. Each timestamp is rounded up to the next multiple of the interval.

Parameters:n_min – str. keyword argument. Interval to round time stamps to. Default ‘5min’.

Note

This uses the function ceil to round up to the next interval. The interval provided must match a known type and contain both a number and a letter such as ‘1D’ to round up to the next whole day.

See documentation for valid types [1]

Warning

This will change the index and timestamp of every record.

[1]: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
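
As a rough illustration of the rounding described above, the sketch below mimics ceil-style rounding using only the standard library. It is not the class's implementation, which relies on pandas; the helper name ceil_to_interval and its minutes argument are hypothetical, and the sketch assumes the interval divides a day evenly.

```python
from datetime import datetime, timedelta


def ceil_to_interval(ts, minutes=5):
    """Round a timestamp up to the next multiple of `minutes` (hypothetical helper)."""
    step = timedelta(minutes=minutes)
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    # How far past the previous interval boundary the timestamp sits.
    remainder = (ts - midnight) % step
    if remainder == timedelta(0):
        return ts  # already on a boundary, nothing to do
    return ts + (step - remainder)


# 09:07 is pushed forward to 09:10; 09:05 is left unchanged.
print(ceil_to_interval(datetime(2014, 11, 17, 9, 7)))
print(ceil_to_interval(datetime(2014, 11, 17, 9, 5)))
```

Because every off-boundary timestamp moves forward, records never move backwards in time, which is why the warning above applies to the whole series.
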
format_temp(col='Temp', unit='C')[source]

Format temperature records to desired units

Parameters:
  • col – keyword argument. str. Name of column containing temperature data. Defaults to ‘Temp’
  • unit – keyword argument. str defining desired unit. Default is ‘C’
format_timezone(tz=-8)[source]

Check that timezone is correct, and if not, adjust the time zone.

Parameters:tz – a timezone as number of hours offset from Greenwich Mean Time
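
A minimal sketch of such an adjustment, assuming the timestamps are naive local times recorded at a fixed GMT offset. The helper shift_gmt_offset is hypothetical and is not part of the module.

```python
from datetime import datetime, timedelta


def shift_gmt_offset(ts, recorded_tz, desired_tz):
    """Re-express a naive timestamp recorded at `recorded_tz` hours from GMT
    as local time at `desired_tz` hours from GMT (hypothetical helper)."""
    return ts + timedelta(hours=desired_tz - recorded_tz)


# A record logged at 11:10 in GMT-07:00 reads 10:10 in GMT-08:00.
print(shift_gmt_offset(datetime(2014, 11, 17, 11, 10), -7, -8))
```
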
get_csv_GMT_offset(header, lineno=-1)[source]

Get timezone as an offset from Greenwich Mean Time from the file header

Parameters:
  • lineno – keyword argument. index of header array. Function operates on specified index. Default -1
  • header – array of header lines where each line is a single string.
Returns:

string of timezone offset from GMT

Example:

String for PST  '-08:00'
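
One way to pull such a string out of a header line, and to turn it into the numeric offset used elsewhere in this class, is sketched below. The function names and the regular expression are assumptions made for illustration, not the module's code.

```python
import re


def parse_gmt_offset(header, lineno=-1):
    """Return the 'GMT±HH:MM' offset found in one header line, or None."""
    match = re.search(r"GMT\s*([+-]\d{2}:\d{2})", header[lineno])
    return match.group(1) if match else None


def offset_to_hours(offset):
    """Convert a string such as '-08:00' to a float number of hours."""
    sign = -1 if offset.startswith("-") else 1
    hours, minutes = offset.lstrip("+-").split(":")
    return sign * (int(hours) + int(minutes) / 60)


header = ["Plot Title: RS12", '"#","Date Time, GMT-07:00","Temp, C"']
print(parse_gmt_offset(header))   # offset string from the last header line
print(offset_to_hours("-08:00"))  # numeric offset for PST
```
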
get_csv_col(header, sep, lineno=-1)[source]

Extract column names from csv format.

From multiple header lines, this extracts a single line, and strips extra info, leaving only column names. File delimiter is used to split header into columns, and ‘,’ is used to split info within a column.

Example:

Single string header:
['"#","Date","Time, GMT-08:00","Temp, °C (LGR S/N: 920980, SEN S/N: 920980)","Intensity, Lux (LGR S/N: 920980, SEN S/N: 920980)"\n']

becomes a list of column strings:

['#', 'Date', 'Time', 'Temp', 'Intensity']
Parameters:
  • header – array of header lines where each line is a single string.
  • lineno – keyword argument. index of header array. Function operates on specified index. Default -1
Returns:

array of column names.
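
The two-stage split described above (file delimiter first, then ‘,’ inside each column) can be sketched with the standard csv module. This is an illustrative approximation with a hypothetical function name, and it assumes the header fields are double-quoted as in the example.

```python
import csv


def column_names(header, sep, lineno=-1):
    """Split one header line into columns, keeping only the leading name."""
    # csv.reader respects the quotes, so commas inside a field are not split here.
    fields = next(csv.reader([header[lineno]], delimiter=sep))
    # Everything after the first ',' inside a field is unit or serial-number info.
    return [field.split(",")[0].strip() for field in fields]


header = ['"#","Date","Time, GMT-08:00","Temp, °C (LGR S/N: 920980, SEN S/N: 920980)",'
          '"Intensity, Lux (LGR S/N: 920980, SEN S/N: 920980)"\n']
print(column_names(header, ","))
```
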

get_csv_intensity_unit(header, lineno=-1)[source]

Get unit for sunlight intensity

Parameters:
  • header – array of header lines where each line is a single string
  • lineno – keyword argument. index of header array. Function operates on specified index. Default -1
Returns:

str defining units for sunlight intensity

get_csv_sn(header, lineno=-1)[source]

Get the logger serial number from the header

Parameters:
  • header – array of header lines where each line is a single string.
  • lineno – keyword argument. index of header array. Function operates on specified index. Default -1
Returns:

str containing serial number
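
A hedged sketch of how a serial number might be read, assuming it appears as ‘LGR S/N: …’ in the column header line, as in the get_csv_col example. The function name and pattern are hypothetical.

```python
import re


def logger_serial_number(header, lineno=-1):
    """Return the first logger serial number in a header line, or None."""
    match = re.search(r"LGR S/N:\s*(\d+)", header[lineno])
    return match.group(1) if match else None


header = ['"#","Date","Time, GMT-08:00","Temp, °C (LGR S/N: 920980, SEN S/N: 920980)"\n']
print(logger_serial_number(header))
```
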

get_csv_temp_unit(header, lineno=-1)[source]

Get unit for temperature records

Parameters:
  • header – array of header lines where each line is a single string.
  • lineno – keyword argument. index of header array. Function operates on specified index. Default -1
Returns:

str with single letter defining units for temperature.
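
The unit letter can be recovered from a column header such as ‘Temp, °C’. The sketch below shows one possible pattern; the function name is hypothetical and the header layout is assumed to match the examples in this module.

```python
import re


def temperature_unit(header, lineno=-1):
    """Return 'C' or 'F' from a header line containing 'Temp, °C' or 'Temp, °F'."""
    match = re.search(r"Temp,\s*°\s*([CF])", header[lineno])
    return match.group(1) if match else None


header = ["Plot Title: RS12", '"#","Date Time, GMT-07:00","Temp, °F","Intensity, lum/ft²"']
print(temperature_unit(header))
```
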

get_delimiter(header, lineno=-1)[source]

Find the delimiter used in the csv file.

As of 3/9/21, the only possible delimiters when exporting from HOBOware are the comma (,), the semicolon (;), and the tab. This method tests which one is used and returns it.

Parameters:
  • header – array of header lines where each line is a single string.
  • lineno – keyword argument. index of header array. Function operates on specified index. Default -1
Returns:

str containing delimiter
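
A simple test for the delimiter might look like the sketch below. It assumes the candidate set is comma, semicolon, and tab, and that header fields are double-quoted; both are assumptions, and guess_delimiter is a hypothetical name rather than the method's code.

```python
def guess_delimiter(header, lineno=-1, candidates=(",", ";", "\t")):
    """Guess which candidate separates the fields of one header line."""
    line = header[lineno]
    # A delimiter between two quoted fields shows up as quote-delimiter-quote,
    # which avoids being fooled by commas inside a field such as "Temp, °C".
    for sep in candidates:
        if f'"{sep}"' in line:
            return sep
    # Fallback for unquoted headers: pick the most frequent candidate.
    return max(candidates, key=line.count)


print(repr(guess_delimiter(['"#";"Date Time, GMT-07:00";"Temp, °C"\n'])))
```
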

get_header_nlines(file_name)[source]

Estimate how many header lines exist in a file.

Parameters:file_name – str containing file path
Returns:int that is index of last header line

Warning

This is a simplistic filter that searches for the first row with fewer than 8 letters. Allowing 8 letters accommodates the 12-hour time format (AM/PM) plus ‘Logged’, while still separating numeric data rows from text headers.

Files whose headers consist of numbers and special characters, or whose data rows contain text, will break the method.

Example:

'Plot Title: RS12'
'#','Date Time, GMT-07:00','Temp, °C','Intensity, lum/ft²','Coupler Attached','Stopped','End Of File'
1,11/17/2014 11:10:00 AM,3.472,16.0,Logged,,

returns 2
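
The letter-count filter can be sketched as follows. This is an approximation with a hypothetical function name, operating on a list of lines rather than a file path. Note that the data row in the example contains exactly 8 letters (AM plus Logged), so this sketch assumes rows with up to 8 letters count as data. It returns the 0-based index of the first data row, which equals the number of header lines.

```python
def count_header_lines(lines, max_letters=8):
    """Return the index of the first row with at most `max_letters` letters."""
    for index, line in enumerate(lines):
        letters = sum(ch.isalpha() for ch in line)
        if letters <= max_letters:
            return index
    return len(lines)  # no data row found: treat every line as header


lines = [
    "Plot Title: RS12",
    "'#','Date Time, GMT-07:00','Temp, °C','Intensity, lum/ft²','Coupler Attached','Stopped','End Of File'",
    "1,11/17/2014 11:10:00 AM,3.472,16.0,Logged,,",
]
print(count_header_lines(lines))
```
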
get_timestamp_col(col)[source]

Time stamps can be exported by HOBO into either 1 or 2 columns

Parameters:col – an array of column names
Returns:list of index locations
Returns:list of column name(s) that make the timestamp
intensity_lumft2_to_lux(intensity)[source]

Convert light intensity records from lumens per square foot (lum/ft²) to Lux

Parameters:intensity – an intensity value or list of intensity values in lum/ft²
Returns:an intensity or list of intensity values in Lux
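
For reference, the conversion follows from 1 ft = 0.3048 m, so one lumen per square foot is about 10.764 lux. The sketch below is a standalone illustration with a hypothetical function name, not the method's source.

```python
# One square foot is 0.3048**2 square metres, so 1 lum/ft² = 1 / 0.3048**2 lux.
LUX_PER_LUMFT2 = 1 / 0.3048 ** 2


def lumft2_to_lux(intensity):
    """Convert a value, or a list of values, from lum/ft² to lux."""
    if isinstance(intensity, (list, tuple)):
        return [value * LUX_PER_LUMFT2 for value in intensity]
    return intensity * LUX_PER_LUMFT2


print(round(lumft2_to_lux(16.0), 1))
```
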
is_intensity_lux()[source]

Read units definition from header and return True if units are Lux

Returns:Boolean. True if light intensity is recorded in Lux
is_temp_celsius()[source]

Read units definition from header and return True if units are Celsius

Returns:Boolean. True if temperature is recorded in Celsius.
is_timezone_correct(tz)[source]

Check the timezone in which data was recorded against the expected timezone

Parameters:tz – a timezone as number of hours offset from Greenwich Mean Time
Returns:Boolean
load_csv_data(fname)[source]

Load csv file output by HOBO pendants into a Pandas DataFrame.

Parameters:fname – str. Filepath of csv data file
read_csv_header(file_name)[source]

Read the header lines from the beginning of a file. Reads n_lines lines and stores them as the headers object.

Parameters:file_name – str. File path of file to be read.
reformat_HOBO_csv(infname, outfname=None, units='SI', tz=-8, tstep='5min')[source]

Imports a csv file output by HOBOware software and checks for:

  • units
  • timezone
  • time sync (09:07 vs 09:05)

File is converted to specified settings and exported to a GCE friendly format.

Parameters:
  • infname – str. Filename to read
  • outfname – str. Filename to output. Defaults to same as infname
  • units – str. System of units desired. Defaults to SI
  • tz – int or float. Timezone as offset from GMT
  • tstep – str. Time interval to sync to. Default is ‘5min’. See HOBOdata.format_sync_timestep() or [2] for valid formats.
[2]: https://pandas.pydata.org/pandas-docs/stable/timeseries.html#offset-aliases
set_data_GMT_offset(hr_offset)[source]

Define time zone of DataFrame timestamps in offset from UTC/GMT

Parameters:hr_offset – floating point of time zone in hours difference from Greenwich Mean Time
temp_F_to_C(temp)[source]

Convert temperature records from Fahrenheit to Celsius

Parameters:temp – a temperature value or list of temperature values in degrees Fahrenheit.
Returns:a temperature value or list of temperature values in degrees Celsius
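
The conversion is the standard C = (F - 32) × 5/9. A minimal standalone sketch, with a hypothetical function name, that accepts either a single value or a list:

```python
def fahrenheit_to_celsius(temp):
    """Convert a value, or a list of values, from degrees Fahrenheit to Celsius."""
    if isinstance(temp, (list, tuple)):
        return [(value - 32) * 5 / 9 for value in temp]
    return (temp - 32) * 5 / 9


print(fahrenheit_to_celsius(212.0))
print(fahrenheit_to_celsius([32.0, 50.0]))
```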