feast.infra.offline_stores package
Submodules
feast.infra.offline_stores.bigquery module
- class feast.infra.offline_stores.bigquery.BigQueryOfflineStore[source]
Bases:
feast.infra.offline_stores.offline_store.OfflineStore
- static get_historical_features(config: feast.repo_config.RepoConfig, feature_views: List[feast.feature_view.FeatureView], feature_refs: List[str], entity_df: Union[pandas.core.frame.DataFrame, str], registry: feast.registry.Registry, project: str, full_feature_names: bool = False) feast.infra.offline_stores.offline_store.RetrievalJob [source]
- static pull_all_from_table_or_query(config: feast.repo_config.RepoConfig, data_source: feast.data_source.DataSource, join_key_columns: List[str], feature_name_columns: List[str], event_timestamp_column: str, start_date: datetime.datetime, end_date: datetime.datetime) feast.infra.offline_stores.offline_store.RetrievalJob [source]
Note that join_key_columns, feature_name_columns, and event_timestamp_column have all already been mapped to column names of the source table, and those column names are the values passed into this function.
- static pull_latest_from_table_or_query(config: feast.repo_config.RepoConfig, data_source: feast.data_source.DataSource, join_key_columns: List[str], feature_name_columns: List[str], event_timestamp_column: str, created_timestamp_column: Optional[str], start_date: datetime.datetime, end_date: datetime.datetime) feast.infra.offline_stores.offline_store.RetrievalJob [source]
Note that join_key_columns, feature_name_columns, event_timestamp_column, and created_timestamp_column have all already been mapped to column names of the source table and those column names are the values passed into this function.
- class feast.infra.offline_stores.bigquery.BigQueryOfflineStoreConfig(*, type: typing_extensions.Literal[bigquery] = 'bigquery', dataset: pydantic.types.StrictStr = 'feast', project_id: pydantic.types.StrictStr = None, location: pydantic.types.StrictStr = None)[source]
Bases:
feast.repo_config.FeastConfigBaseModel
Offline store config for GCP BigQuery
- dataset: pydantic.types.StrictStr
(optional) BigQuery Dataset name for temporary tables
- location: Optional[pydantic.types.StrictStr]
(optional) GCP location name used for the BigQuery offline store. Examples of location names include US, EU, us-central1, us-west4. If a location is not specified, the location defaults to the US multi-regional location. For more information on BigQuery data locations see: https://cloud.google.com/bigquery/docs/locations
- project_id: Optional[pydantic.types.StrictStr]
(optional) GCP project name used for the BigQuery offline store
- type: typing_extensions.Literal[bigquery]
Offline store type selector
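The documented fields and defaults can be summarised in a short sketch. It uses a stand-in dataclass rather than the real pydantic model, so the names `BigQueryOfflineStoreSettings` and `effective_location` are hypothetical and exist only for illustration; the fallback to US simply restates the behaviour described for `location` above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BigQueryOfflineStoreSettings:
    """Stand-in mirroring the documented fields; not the real Feast class."""

    type: str = "bigquery"
    dataset: str = "feast"            # dataset used for temporary tables
    project_id: Optional[str] = None  # GCP project, if set explicitly
    location: Optional[str] = None    # e.g. "US", "EU", "us-central1"


def effective_location(settings: BigQueryOfflineStoreSettings) -> str:
    """Return the location that applies, falling back to the US multi-region."""
    return settings.location or "US"


default_settings = BigQueryOfflineStoreSettings()
regional_settings = BigQueryOfflineStoreSettings(location="us-west4")
```

Leaving `location` unset and setting it to a single region are the two cases the field description distinguishes.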
- class feast.infra.offline_stores.bigquery.BigQueryRetrievalJob(query: Union[str, Callable[[], AbstractContextManager[str]]], client: google.cloud.bigquery.client.Client, config: feast.repo_config.RepoConfig, full_feature_names: bool, on_demand_feature_views: Optional[List[feast.on_demand_feature_view.OnDemandFeatureView]] = None, metadata: Optional[feast.infra.offline_stores.offline_store.RetrievalMetadata] = None)[source]
Bases:
feast.infra.offline_stores.offline_store.RetrievalJob
- property metadata: Optional[feast.infra.offline_stores.offline_store.RetrievalMetadata]
Return metadata information about retrieval. Should be available even before materializing the dataset itself.
- property on_demand_feature_views: Optional[List[feast.on_demand_feature_view.OnDemandFeatureView]]
- persist(storage: feast.saved_dataset.SavedDatasetStorage)[source]
Run the retrieval and persist the results in the same offline store used for read.
- to_bigquery(job_config: Optional[google.cloud.bigquery.job.query.QueryJobConfig] = None, timeout: int = 1800, retry_cadence: int = 10) Optional[str] [source]
Triggers the execution of a historical feature retrieval query and exports the results to a BigQuery table. Runs for a maximum amount of time specified by the timeout parameter (defaulting to 30 minutes).
- Parameters
job_config – An optional bigquery.QueryJobConfig to specify options like destination table, dry run, etc.
timeout – An optional number of seconds for setting the time limit of the QueryJob.
retry_cadence – An optional number of seconds to wait between checks for job completion.
- Returns
Returns the destination table name or returns None if job_config.dry_run is True.
- feast.infra.offline_stores.bigquery.block_until_done(client: google.cloud.bigquery.client.Client, bq_job: Union[google.cloud.bigquery.job.query.QueryJob, google.cloud.bigquery.job.load.LoadJob], timeout: int = 1800, retry_cadence: float = 1)[source]
Waits for bq_job to finish running, up to a maximum amount of time specified by the timeout parameter (defaulting to 30 minutes).
- Parameters
client – A bigquery.client.Client to monitor the bq_job.
bq_job – The bigquery.job.QueryJob or bigquery.job.LoadJob to wait on until it is done running.
timeout – An optional number of seconds for setting the time limit of the job.
retry_cadence – An optional number of seconds to wait between checks for job completion.
- Raises
BigQueryJobStillRunning – if the function has blocked for longer than the timeout (30 minutes by default).
BigQueryJobCancelled – if the job has been cancelled (i.e. from timeout or KeyboardInterrupt).
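The interplay of timeout and retry_cadence can be illustrated with a generic polling loop. This is a minimal sketch under stated assumptions, not how Feast implements block_until_done: `wait_for_job` and `FakeJob` are hypothetical names, the exception class is a local stand-in for the one named above, and the fake job only mimics the `done()` and `cancel()` methods of a BigQuery job.

```python
import time


class BigQueryJobStillRunning(Exception):
    """Local stand-in for the exception named above."""


class FakeJob:
    """Hypothetical job that reports done after a fixed number of polls."""

    def __init__(self, polls_until_done: int) -> None:
        self.polls_left = polls_until_done
        self.cancelled = False

    def done(self) -> bool:
        self.polls_left -= 1
        return self.polls_left <= 0

    def cancel(self) -> None:
        self.cancelled = True


def wait_for_job(job, timeout: float = 1800, retry_cadence: float = 1) -> None:
    """Poll job.done() every retry_cadence seconds until timeout elapses."""
    deadline = time.monotonic() + timeout
    while not job.done():
        if time.monotonic() >= deadline:
            # Give up: cancel the job and tell the caller it never finished.
            job.cancel()
            raise BigQueryJobStillRunning(f"job still running after {timeout}s")
        time.sleep(retry_cadence)


quick_job = FakeJob(polls_until_done=3)
wait_for_job(quick_job, timeout=5, retry_cadence=0.01)
```

A smaller retry_cadence notices completion sooner at the cost of more status checks; timeout bounds the total wait either way.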
feast.infra.offline_stores.file module
- class feast.infra.offline_stores.file.FileOfflineStore[source]
Bases:
feast.infra.offline_stores.offline_store.OfflineStore
- static get_historical_features(config: feast.repo_config.RepoConfig, feature_views: List[feast.feature_view.FeatureView], feature_refs: List[str], entity_df: Union[pandas.core.frame.DataFrame, str], registry: feast.registry.Registry, project: str, full_feature_names: bool = False) feast.infra.offline_stores.offline_store.RetrievalJob [source]
- static pull_all_from_table_or_query(config: feast.repo_config.RepoConfig, data_source: feast.data_source.DataSource, join_key_columns: List[str], feature_name_columns: List[str], event_timestamp_column: str, start_date: datetime.datetime, end_date: datetime.datetime) feast.infra.offline_stores.offline_store.RetrievalJob [source]
Note that join_key_columns, feature_name_columns, and event_timestamp_column have all already been mapped to column names of the source table, and those column names are the values passed into this function.
- static pull_latest_from_table_or_query(config: feast.repo_config.RepoConfig, data_source: feast.data_source.DataSource, join_key_columns: List[str], feature_name_columns: List[str], event_timestamp_column: str, created_timestamp_column: Optional[str], start_date: datetime.datetime, end_date: datetime.datetime) feast.infra.offline_stores.offline_store.RetrievalJob [source]
Note that join_key_columns, feature_name_columns, event_timestamp_column, and created_timestamp_column have all already been mapped to column names of the source table and those column names are the values passed into this function.
- class feast.infra.offline_stores.file.FileOfflineStoreConfig(*, type: typing_extensions.Literal[file] = 'file')[source]
Bases:
feast.repo_config.FeastConfigBaseModel
Offline store config for local (file-based) store
- type: typing_extensions.Literal[file]
Offline store type selector
- class feast.infra.offline_stores.file.FileRetrievalJob(evaluation_function: Callable, full_feature_names: bool, on_demand_feature_views: Optional[List[feast.on_demand_feature_view.OnDemandFeatureView]] = None, metadata: Optional[feast.infra.offline_stores.offline_store.RetrievalMetadata] = None)[source]
Bases:
feast.infra.offline_stores.offline_store.RetrievalJob
- property metadata: Optional[feast.infra.offline_stores.offline_store.RetrievalMetadata]
Return metadata information about retrieval. Should be available even before materializing the dataset itself.
- property on_demand_feature_views: Optional[List[feast.on_demand_feature_view.OnDemandFeatureView]]
feast.infra.offline_stores.helpers module
feast.infra.offline_stores.offline_store module
- class feast.infra.offline_stores.offline_store.OfflineStore[source]
Bases:
abc.ABC
OfflineStore is an object used for all interaction between Feast and the service used for offline storage of features.
- abstract static get_historical_features(config: feast.repo_config.RepoConfig, feature_views: List[feast.feature_view.FeatureView], feature_refs: List[str], entity_df: Union[pandas.core.frame.DataFrame, str], registry: feast.registry.Registry, project: str, full_feature_names: bool = False) feast.infra.offline_stores.offline_store.RetrievalJob [source]
- abstract static pull_all_from_table_or_query(config: feast.repo_config.RepoConfig, data_source: feast.data_source.DataSource, join_key_columns: List[str], feature_name_columns: List[str], event_timestamp_column: str, start_date: datetime.datetime, end_date: datetime.datetime) feast.infra.offline_stores.offline_store.RetrievalJob [source]
Note that join_key_columns, feature_name_columns, and event_timestamp_column have all already been mapped to column names of the source table, and those column names are the values passed into this function.
- abstract static pull_latest_from_table_or_query(config: feast.repo_config.RepoConfig, data_source: feast.data_source.DataSource, join_key_columns: List[str], feature_name_columns: List[str], event_timestamp_column: str, created_timestamp_column: Optional[str], start_date: datetime.datetime, end_date: datetime.datetime) feast.infra.offline_stores.offline_store.RetrievalJob [source]
Note that join_key_columns, feature_name_columns, event_timestamp_column, and created_timestamp_column have all already been mapped to column names of the source table and those column names are the values passed into this function.
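As an illustration of what "latest" means for pull_latest_from_table_or_query, the sketch below selects the newest row per join key over plain Python dictionaries. It is not taken from any Feast offline store: the function `latest_rows`, the column names (`driver_id`, `ts`, `created`, `trips`) and the sample rows are hypothetical, and treating the date range as inclusive and using the created timestamp as a tie-breaker are assumptions.

```python
from datetime import datetime
from typing import Dict, List, Optional, Tuple


def latest_rows(
    rows: List[dict],
    join_key_columns: List[str],
    event_timestamp_column: str,
    created_timestamp_column: Optional[str],
    start_date: datetime,
    end_date: datetime,
) -> List[dict]:
    """Keep, per join key, the newest row whose event timestamp is in range."""
    best: Dict[Tuple, dict] = {}

    def sort_key(row: dict) -> tuple:
        # Event time first; creation time only breaks ties when it is given.
        if created_timestamp_column is None:
            return (row[event_timestamp_column],)
        return (row[event_timestamp_column], row[created_timestamp_column])

    for row in rows:
        # Assumption: both ends of the date range are inclusive.
        if not (start_date <= row[event_timestamp_column] <= end_date):
            continue
        key = tuple(row[column] for column in join_key_columns)
        if key not in best or sort_key(row) > sort_key(best[key]):
            best[key] = row
    return list(best.values())


rows = [
    {"driver_id": 1, "ts": datetime(2022, 1, 1), "created": datetime(2022, 1, 1, 1), "trips": 5},
    {"driver_id": 1, "ts": datetime(2022, 1, 2), "created": datetime(2022, 1, 2, 1), "trips": 7},
    {"driver_id": 1, "ts": datetime(2022, 1, 2), "created": datetime(2022, 1, 2, 2), "trips": 8},
    {"driver_id": 2, "ts": datetime(2022, 1, 5), "created": datetime(2022, 1, 5, 1), "trips": 3},
]
latest = latest_rows(
    rows, ["driver_id"], "ts", "created", datetime(2022, 1, 1), datetime(2022, 1, 3)
)
```

The column-name arguments are plain strings that already refer to columns of the source rows, which is the convention the note above describes.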
- class feast.infra.offline_stores.offline_store.RetrievalJob[source]
Bases:
abc.ABC
RetrievalJob is used to manage the execution of a historical feature retrieval.
- abstract property metadata: Optional[feast.infra.offline_stores.offline_store.RetrievalMetadata]
Return metadata information about retrieval. Should be available even before materializing the dataset itself.
- abstract property on_demand_feature_views: Optional[List[feast.on_demand_feature_view.OnDemandFeatureView]]
- abstract persist(storage: feast.saved_dataset.SavedDatasetStorage)[source]
Run the retrieval and persist the results in the same offline store used for read.
- class feast.infra.offline_stores.offline_store.RetrievalMetadata(features: List[str], keys: List[str], min_event_timestamp: Optional[datetime.datetime] = None, max_event_timestamp: Optional[datetime.datetime] = None)[source]
Bases:
object
- max_event_timestamp: Optional[datetime.datetime]
- min_event_timestamp: Optional[datetime.datetime]
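The requirement that metadata be available before the dataset is materialized can be pictured with a lazily evaluated job. The classes below are stand-ins, not the Feast classes: `Metadata` only copies the four documented RetrievalMetadata fields, while `LazyJob`, its `to_rows` method, the `evaluations` counter and the sample values are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List, Optional


@dataclass
class Metadata:
    """Stand-in with the same four fields as RetrievalMetadata."""

    features: List[str]
    keys: List[str]
    min_event_timestamp: Optional[datetime] = None
    max_event_timestamp: Optional[datetime] = None


class LazyJob:
    """Hypothetical job: stores a callable and runs it only on demand."""

    def __init__(
        self,
        evaluation_function: Callable[[], list],
        metadata: Optional[Metadata] = None,
    ) -> None:
        self._evaluation_function = evaluation_function
        self._metadata = metadata
        self.evaluations = 0  # counts how often the retrieval actually ran

    @property
    def metadata(self) -> Optional[Metadata]:
        # Reading metadata never triggers the retrieval itself.
        return self._metadata

    def to_rows(self) -> list:
        self.evaluations += 1
        return self._evaluation_function()


job = LazyJob(
    evaluation_function=lambda: [{"driver_id": 1, "trips": 8}],
    metadata=Metadata(
        features=["driver_stats:trips"],
        keys=["driver_id"],
        min_event_timestamp=datetime(2022, 1, 1),
        max_event_timestamp=datetime(2022, 1, 3),
    ),
)
```

Inspecting `job.metadata` leaves the evaluation function untouched; only `job.to_rows()` runs it.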