src.master package
Submodules
src.master.main module
src.master.main.get_combo_string(records: pd.DataFrame, cols: list)
Paste column values together, separated by ‘_’.
Parameters: - records (pd.DataFrame) – Input dataset.
- cols (list) – Columns to be pasted together.
Returns: List of pasted column values.
Return type: list
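The pasting behaviour described above can be sketched as follows. This is only an illustration, assuming that values are cast to strings before joining; the name combo_string_sketch and the sample data are hypothetical and are not taken from the package source.

```python
import pandas as pd


def combo_string_sketch(records: pd.DataFrame, cols: list) -> list:
    """Hypothetical stand-in for get_combo_string, for illustration only."""
    # Cast to str so that dates and numbers can be joined (an assumption).
    as_text = records[cols].astype(str)
    # Join the selected values of each row with "_".
    return ["_".join(row) for row in as_text.itertuples(index=False, name=None)]


records = pd.DataFrame(
    {
        "country_territory_area": ["United States of America", "France"],
        "date_start": ["2020-01-01", "2020-02-15"],
    }
)
combos = combo_string_sketch(records, ["country_territory_area", "date_start"])
print(combos)
# ['United States of America_2020-01-01', 'France_2020-02-15']
```

The result has one string per input row, in row order.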
src.master.main.get_new_records(records: pd.DataFrame, previous_update: pd.DataFrame, cols: list)
Identify records in an update that were not present in the previous update.
Records are matched on an identifier formed by pasting the values of cols together.
Example
Given cols = [‘country_territory_area’, ‘date_start’], the values in these columns are pasted together into a “combo string”.
Any record in records whose “combo string” also appears in previous_update is not recognised as a new record.
For example, “United States of America_2020-01-01” == “United States of America_2020-01-01” means that the records match.
Parameters: - records (pd.DataFrame) – Newly updated data.
- previous_update (pd.DataFrame) – Previously updated data.
- cols (list) – Columns to be considered when merging records.
Returns: New records not present in previous_update.
Return type: pd.DataFrame
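A minimal sketch of the matching logic described above, assuming the identifier is built by casting each value to a string and joining with ‘_’. The function new_records_sketch and the sample data are hypothetical; the actual implementation may differ, for example in how it handles missing values or duplicates.

```python
import pandas as pd


def new_records_sketch(
    records: pd.DataFrame, previous_update: pd.DataFrame, cols: list
) -> pd.DataFrame:
    """Hypothetical stand-in for get_new_records, for illustration only."""

    def combo(df: pd.DataFrame) -> list:
        # Build one "combo string" per row from the selected columns.
        rows = df[cols].astype(str).itertuples(index=False, name=None)
        return ["_".join(row) for row in rows]

    # Combo strings already seen in the previous update.
    seen = set(combo(previous_update))
    # Keep only the rows whose combo string has not been seen before.
    mask = [c not in seen for c in combo(records)]
    return records.loc[mask]


cols = ["country_territory_area", "date_start"]
previous_update = pd.DataFrame(
    {
        "country_territory_area": ["United States of America"],
        "date_start": ["2020-01-01"],
    }
)
records = pd.DataFrame(
    {
        "country_territory_area": ["United States of America", "France"],
        "date_start": ["2020-01-01", "2020-02-15"],
    }
)
new = new_records_sketch(records, previous_update, cols)
print(new)
```

Here only the France row is returned, because “United States of America_2020-01-01” already appears in previous_update.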