feast.infra package
Subpackages
- feast.infra.offline_stores package
- Subpackages
- feast.infra.offline_stores.contrib package
- Subpackages
- Submodules
- feast.infra.offline_stores.contrib.athena_repo_configuration module
- feast.infra.offline_stores.contrib.mssql_repo_configuration module
- feast.infra.offline_stores.contrib.postgres_repo_configuration module
- feast.infra.offline_stores.contrib.spark_repo_configuration module
- feast.infra.offline_stores.contrib.trino_repo_configuration module
- Module contents
- Submodules
- feast.infra.offline_stores.bigquery module
- feast.infra.offline_stores.bigquery_source module
- feast.infra.offline_stores.file module
- feast.infra.offline_stores.file_source module
- feast.infra.offline_stores.offline_store module
- feast.infra.offline_stores.offline_utils module
- feast.infra.offline_stores.redshift module
- feast.infra.offline_stores.redshift_source module
- feast.infra.offline_stores.snowflake module
- feast.infra.offline_stores.snowflake_source module
- Module contents
- feast.infra.online_stores package
- Subpackages
- feast.infra.online_stores.contrib package
- Subpackages
- Submodules
- feast.infra.online_stores.contrib.cassandra_repo_configuration module
- feast.infra.online_stores.contrib.hbase_repo_configuration module
- feast.infra.online_stores.contrib.mysql_repo_configuration module
- feast.infra.online_stores.contrib.postgres module
- feast.infra.online_stores.contrib.postgres_repo_configuration module
- Module contents
- Submodules
- feast.infra.online_stores.bigtable module
- feast.infra.online_stores.datastore module
- feast.infra.online_stores.dynamodb module
- feast.infra.online_stores.helpers module
- feast.infra.online_stores.online_store module
- feast.infra.online_stores.redis module
- feast.infra.online_stores.snowflake module
- feast.infra.online_stores.sqlite module
- Module contents
- feast.infra.registry package
- feast.infra.transformation_servers package
- feast.infra.utils package
Submodules
feast.infra.aws module
- class feast.infra.aws.AwsProvider(config: feast.repo_config.RepoConfig)[source]
Bases:
feast.infra.passthrough_provider.PassthroughProvider
- get_feature_server_endpoint() Optional[str] [source]
Returns endpoint for the feature server, if it exists.
- teardown_infra(project: str, tables: Sequence[feast.feature_view.FeatureView], entities: Sequence[feast.entity.Entity]) None [source]
Tears down all cloud resources for the specified set of Feast objects.
- Parameters
project – Feast project to which the objects belong.
tables – Feature views whose corresponding infrastructure should be deleted.
entities – Entities whose corresponding infrastructure should be deleted.
- update_infra(project: str, tables_to_delete: Sequence[feast.feature_view.FeatureView], tables_to_keep: Sequence[feast.feature_view.FeatureView], entities_to_delete: Sequence[feast.entity.Entity], entities_to_keep: Sequence[feast.entity.Entity], partial: bool)[source]
Reconciles cloud resources with the specified set of Feast objects.
- Parameters
project – Feast project to which the objects belong.
tables_to_delete – Feature views whose corresponding infrastructure should be deleted.
tables_to_keep – Feature views whose corresponding infrastructure should not be deleted, and may need to be updated.
entities_to_delete – Entities whose corresponding infrastructure should be deleted.
entities_to_keep – Entities whose corresponding infrastructure should not be deleted, and may need to be updated.
partial – If True, tables_to_delete and tables_to_keep are not exhaustive lists, so infrastructure corresponding to other feature views should not be touched.
feast.infra.gcp module
- class feast.infra.gcp.GcpProvider(config: feast.repo_config.RepoConfig)[source]
Bases:
feast.infra.passthrough_provider.PassthroughProvider
This class only exists for backwards compatibility.
feast.infra.infra_object module
- class feast.infra.infra_object.Infra(infra_objects: List[feast.infra.infra_object.InfraObject] = <factory>)[source]
Bases:
object
Represents the set of infrastructure managed by Feast.
- Parameters
infra_objects – A list of InfraObjects, each representing one infrastructure object.
- classmethod from_proto(infra_proto: feast.core.InfraObject_pb2.Infra)[source]
Returns an Infra object created from a protobuf representation.
- infra_objects: List[feast.infra.infra_object.InfraObject]
- class feast.infra.infra_object.InfraObject(name: str)[source]
Bases:
abc.ABC
Represents a single infrastructure object (e.g. online store table) managed by Feast.
- abstract static from_infra_object_proto(infra_object_proto: feast.core.InfraObject_pb2.InfraObject) Any [source]
Returns an InfraObject created from a protobuf representation.
- Parameters
infra_object_proto – A protobuf representation of an InfraObject.
- Raises
FeastInvalidInfraObjectType – The type of InfraObject could not be identified.
- static from_proto(infra_object_proto: Any) Any [source]
Converts a protobuf representation of a subclass to an object of that subclass.
- Parameters
infra_object_proto – A protobuf representation of an InfraObject.
- Raises
FeastInvalidInfraObjectType – The type of InfraObject could not be identified.
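The two from_proto entry points implement a type-dispatch pattern: the registry stores a heterogeneous list of infra objects, and each concrete class knows how to rehydrate itself from its own proto message. A minimal self-contained sketch of that pattern (toy classes and a plain dict standing in for the proto — not Feast's actual proto handling):

```python
from abc import ABC

class InfraObjectSketch(ABC):
    """Toy stand-in for feast.infra.infra_object.InfraObject."""

    # Maps a type tag (in Feast, the oneof field set on the proto) to a class.
    _registry: dict = {}

    def __init_subclass__(cls, tag=None, **kwargs):
        super().__init_subclass__(**kwargs)
        if tag is not None:
            InfraObjectSketch._registry[tag] = cls

    @classmethod
    def from_proto(cls, proto: dict):
        """Dispatch to the concrete subclass named by the proto's type tag."""
        try:
            subclass = cls._registry[proto["type"]]
        except KeyError:
            # Plays the role of FeastInvalidInfraObjectType in the sketch.
            raise ValueError(f"Unknown infra object type: {proto['type']!r}")
        return subclass(proto["name"])

class SqliteTableSketch(InfraObjectSketch, tag="sqlite_table"):
    def __init__(self, name: str):
        self.name = name

obj = InfraObjectSketch.from_proto({"type": "sqlite_table", "name": "driver_stats"})
```

The real classes raise FeastInvalidInfraObjectType instead of ValueError and inspect the proto's oneof field rather than a dict key, but the dispatch shape is the same.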
feast.infra.key_encoding_utils module
- feast.infra.key_encoding_utils.serialize_entity_key(entity_key: feast.types.EntityKey_pb2.EntityKey, entity_key_serialization_version=1) bytes [source]
Serialize entity key to a bytestring so it can be used as a lookup key in a hash table.
We need this encoding to be stable; therefore we cannot just use protobuf serialization here since it does not guarantee that two proto messages containing the same data will serialize to the same byte string[1].
[1] https://developers.google.com/protocol-buffers/docs/encoding
- feast.infra.key_encoding_utils.serialize_entity_key_prefix(entity_keys: List[str]) bytes [source]
Serialize keys to a bytestring, so it can be used to prefix-scan through items stored in the online store using serialize_entity_key.
This encoding is a partial implementation of serialize_entity_key, only operating on the keys of entities, and not the values.
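Why a custom stable encoding matters, and how the prefix variant enables range scans, can be illustrated with a simplified stand-in (this is not Feast's actual wire format — just the two properties it relies on: logically equal entity keys always encode to the same bytes, and the keys-only encoding is a byte prefix of the full encoding):

```python
import struct

def sketch_serialize(entity_key: dict) -> bytes:
    """Stable toy encoding: sort by join-key name, keys first, then values."""
    out = b""
    items = sorted(entity_key.items())
    for name, _ in items:  # keys section (what the prefix variant emits)
        encoded = name.encode("utf8")
        out += struct.pack("<I", len(encoded)) + encoded
    for _, value in items:  # values section
        encoded = str(value).encode("utf8")
        out += struct.pack("<I", len(encoded)) + encoded
    return out

def sketch_serialize_prefix(join_keys: list) -> bytes:
    """Keys-only encoding: a byte prefix of sketch_serialize's output."""
    out = b""
    for name in sorted(join_keys):
        encoded = name.encode("utf8")
        out += struct.pack("<I", len(encoded)) + encoded
    return out

# Insertion order does not change the encoding, unlike raw proto serialization.
a = sketch_serialize({"driver_id": 1001, "trip_id": 7})
b = sketch_serialize({"trip_id": 7, "driver_id": 1001})
prefix = sketch_serialize_prefix(["driver_id", "trip_id"])
```

Because the prefix property holds, an online store with ordered byte keys can find every row for a set of entities with a single prefix scan.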
feast.infra.local module
- class feast.infra.local.LocalProvider(config: feast.repo_config.RepoConfig)[source]
Bases:
feast.infra.passthrough_provider.PassthroughProvider
This class only exists for backwards compatibility.
- plan_infra(config: feast.repo_config.RepoConfig, desired_registry_proto: feast.core.Registry_pb2.Registry) feast.infra.infra_object.Infra [source]
Returns the Infra required to support the desired registry.
- Parameters
config – The RepoConfig for the current FeatureStore.
desired_registry_proto – The desired registry, in proto form.
feast.infra.passthrough_provider module
- class feast.infra.passthrough_provider.PassthroughProvider(config: feast.repo_config.RepoConfig)[source]
Bases:
feast.infra.provider.Provider
The passthrough provider delegates all operations to the underlying online and offline stores.
- property batch_engine: feast.infra.materialization.batch_materialization_engine.BatchMaterializationEngine
- get_historical_features(config: feast.repo_config.RepoConfig, feature_views: List[feast.feature_view.FeatureView], feature_refs: List[str], entity_df: Union[pandas.core.frame.DataFrame, str], registry: feast.infra.registry.base_registry.BaseRegistry, project: str, full_feature_names: bool) feast.infra.offline_stores.offline_store.RetrievalJob [source]
Retrieves the point-in-time correct historical feature values for the specified entity rows.
- Parameters
config – The config for the current feature store.
feature_views – A list containing all feature views that are referenced in the entity rows.
feature_refs – The features to be retrieved.
entity_df – A collection of rows containing all entity columns on which features need to be joined, as well as the timestamp column used for point-in-time joins. Either a pandas dataframe can be provided or a SQL query.
registry – The registry for the current feature store.
project – Feast project to which the feature views belong.
full_feature_names – If True, feature names will be prefixed with the corresponding feature view name, changing them from the format “feature” to “feature_view__feature” (e.g. “daily_transactions” changes to “customer_fv__daily_transactions”).
- Returns
A RetrievalJob that can be executed to get the features.
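The effect of full_feature_names on the retrieved columns can be shown with a one-line sketch of the renaming rule (illustrative only — the real column naming happens inside the offline store implementations):

```python
def output_column(feature_view: str, feature: str, full_feature_names: bool) -> str:
    """Column name a feature gets in the retrieved dataframe."""
    return f"{feature_view}__{feature}" if full_feature_names else feature

# With full_feature_names=True, "daily_transactions" from the feature view
# "customer_fv" becomes "customer_fv__daily_transactions"; with False it is
# returned unprefixed, which can collide if two feature views share a name.
col = output_column("customer_fv", "daily_transactions", full_feature_names=True)
```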
- ingest_df(feature_view: feast.feature_view.FeatureView, df: pandas.core.frame.DataFrame)[source]
Persists a dataframe to the online store.
- Parameters
feature_view – The feature view to which the dataframe corresponds.
df – The dataframe to be persisted.
- ingest_df_to_offline_store(feature_view: feast.feature_view.FeatureView, table: pyarrow.lib.Table)[source]
Persists a pyarrow table to the offline store.
- Parameters
feature_view – The feature view to which the table corresponds.
table – The pyarrow table to be persisted.
- materialize_single_feature_view(config: feast.repo_config.RepoConfig, feature_view: feast.feature_view.FeatureView, start_date: datetime.datetime, end_date: datetime.datetime, registry: feast.infra.registry.base_registry.BaseRegistry, project: str, tqdm_builder: Callable[[int], tqdm.std.tqdm]) None [source]
Writes latest feature values in the specified time range to the online store.
- Parameters
config – The config for the current feature store.
feature_view – The feature view to materialize.
start_date – The start of the time range.
end_date – The end of the time range.
registry – The registry for the current feature store.
project – Feast project to which the objects belong.
tqdm_builder – A function to monitor the progress of materialization.
- property offline_store
- offline_write_batch(config: feast.repo_config.RepoConfig, feature_view: feast.feature_view.FeatureView, data: pyarrow.lib.Table, progress: Optional[Callable[[int], Any]]) None [source]
- online_read(config: feast.repo_config.RepoConfig, table: feast.feature_view.FeatureView, entity_keys: List[feast.types.EntityKey_pb2.EntityKey], requested_features: List[str] = None) List [source]
Reads feature values for the given entity keys.
- Parameters
config – The config for the current feature store.
table – The feature view whose feature values should be read.
entity_keys – The list of entity keys for which feature values should be read.
requested_features – The list of features that should be read.
- Returns
A list of the same length as entity_keys. Each item in the list is a tuple where the first item is the event timestamp for the row, and the second item is a dict mapping feature names to values, which are returned in proto format.
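A toy in-memory version makes the return contract concrete: one (event_timestamp, feature_dict) tuple per entity key, in order, with (None, None) for keys that have no stored row (a sketch with a plain dict as the store, not a real online store):

```python
from datetime import datetime
from typing import Dict, List, Optional, Tuple

def sketch_online_read(
    store: Dict[bytes, Tuple[datetime, Dict[str, object]]],
    entity_keys: List[bytes],
    requested_features: Optional[List[str]] = None,
) -> List[Tuple[Optional[datetime], Optional[Dict[str, object]]]]:
    """Mirror the online_read contract over a plain dict."""
    result = []
    for key in entity_keys:
        if key not in store:
            result.append((None, None))  # missing rows keep their slot
            continue
        ts, values = store[key]
        if requested_features is not None:
            values = {f: values[f] for f in requested_features if f in values}
        result.append((ts, values))
    return result

store = {b"driver:1001": (datetime(2023, 1, 1), {"trips": 12, "rating": 4.9})}
rows = sketch_online_read(store, [b"driver:1001", b"driver:9999"], ["trips"])
```

The real method returns feature values in proto format (feast.types.Value_pb2.Value) rather than plain Python objects.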
- property online_store
- online_write_batch(config: feast.repo_config.RepoConfig, table: feast.feature_view.FeatureView, data: List[Tuple[feast.types.EntityKey_pb2.EntityKey, Dict[str, feast.types.Value_pb2.Value], datetime.datetime, Optional[datetime.datetime]]], progress: Optional[Callable[[int], Any]]) None [source]
Writes a batch of feature rows to the online store.
If a tz-naive timestamp is passed to this method, it is assumed to be UTC.
- Parameters
config – The config for the current feature store.
table – Feature view to which these feature rows correspond.
data – A list of quadruplets containing feature data. Each quadruplet contains an entity key, a dict containing feature values, an event timestamp for the row, and the created timestamp for the row if it exists.
progress – Function to be called once a batch of rows is written to the online store, used to show progress.
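The quadruplet layout and the tz-naive-means-UTC rule can be sketched as follows (illustrative only — a real implementation writes to the configured online store rather than a dict):

```python
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List, Optional, Tuple

Quadruplet = Tuple[bytes, Dict[str, Any], datetime, Optional[datetime]]

def sketch_online_write_batch(
    store: Dict[bytes, Tuple[datetime, Dict[str, Any]]],
    data: List[Quadruplet],
    progress: Optional[Callable[[int], Any]] = None,
) -> None:
    """Write (entity_key, values, event_ts, created_ts) rows into a dict store."""
    for entity_key, values, event_ts, _created_ts in data:
        if event_ts.tzinfo is None:
            # tz-naive timestamps are assumed to be UTC
            event_ts = event_ts.replace(tzinfo=timezone.utc)
        store[entity_key] = (event_ts, values)
        if progress is not None:
            progress(1)  # report one row written

store: dict = {}
written: list = []
sketch_online_write_batch(
    store,
    [(b"driver:1001", {"trips": 12}, datetime(2023, 1, 1), None)],
    progress=written.append,
)
```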
- retrieve_feature_service_logs(feature_service: feast.feature_service.FeatureService, start_date: datetime.datetime, end_date: datetime.datetime, config: feast.repo_config.RepoConfig, registry: feast.infra.registry.base_registry.BaseRegistry) feast.infra.offline_stores.offline_store.RetrievalJob [source]
Reads logged features for the specified time window.
- Parameters
feature_service – The feature service whose logs should be retrieved.
start_date – The start of the window.
end_date – The end of the window.
config – The config for the current feature store.
registry – The registry for the current feature store.
- Returns
A RetrievalJob that can be executed to get the feature service logs.
- retrieve_saved_dataset(config: feast.repo_config.RepoConfig, dataset: feast.saved_dataset.SavedDataset) feast.infra.offline_stores.offline_store.RetrievalJob [source]
Reads a saved dataset.
- Parameters
config – The config for the current feature store.
dataset – A SavedDataset object containing all parameters necessary for retrieving the dataset.
- Returns
A RetrievalJob that can be executed to get the saved dataset.
- teardown_infra(project: str, tables: Sequence[feast.feature_view.FeatureView], entities: Sequence[feast.entity.Entity]) None [source]
Tears down all cloud resources for the specified set of Feast objects.
- Parameters
project – Feast project to which the objects belong.
tables – Feature views whose corresponding infrastructure should be deleted.
entities – Entities whose corresponding infrastructure should be deleted.
- update_infra(project: str, tables_to_delete: Sequence[feast.feature_view.FeatureView], tables_to_keep: Sequence[feast.feature_view.FeatureView], entities_to_delete: Sequence[feast.entity.Entity], entities_to_keep: Sequence[feast.entity.Entity], partial: bool)[source]
Reconciles cloud resources with the specified set of Feast objects.
- Parameters
project – Feast project to which the objects belong.
tables_to_delete – Feature views whose corresponding infrastructure should be deleted.
tables_to_keep – Feature views whose corresponding infrastructure should not be deleted, and may need to be updated.
entities_to_delete – Entities whose corresponding infrastructure should be deleted.
entities_to_keep – Entities whose corresponding infrastructure should not be deleted, and may need to be updated.
partial – If True, tables_to_delete and tables_to_keep are not exhaustive lists, so infrastructure corresponding to other feature views should not be touched.
- write_feature_service_logs(feature_service: feast.feature_service.FeatureService, logs: Union[pyarrow.lib.Table, str], config: feast.repo_config.RepoConfig, registry: feast.infra.registry.base_registry.BaseRegistry)[source]
Writes features and entities logged by a feature server to the offline store.
The schema of the logs table is inferred from the specified feature service. Only feature services with configured logging are accepted.
- Parameters
feature_service – The feature service to be logged.
logs – The logs, either as an arrow table or as a path to a parquet directory.
config – The config for the current feature store.
registry – The registry for the current feature store.
feast.infra.provider module
- class feast.infra.provider.Provider(config: feast.repo_config.RepoConfig)[source]
Bases:
abc.ABC
A provider defines an implementation of a feature store object. It orchestrates the various components of a feature store, such as the offline store, online store, and materialization engine. It is configured through a RepoConfig object.
- get_feature_server_endpoint() Optional[str] [source]
Returns endpoint for the feature server, if it exists.
- abstract get_historical_features(config: feast.repo_config.RepoConfig, feature_views: List[feast.feature_view.FeatureView], feature_refs: List[str], entity_df: Union[pandas.core.frame.DataFrame, str], registry: feast.infra.registry.base_registry.BaseRegistry, project: str, full_feature_names: bool) feast.infra.offline_stores.offline_store.RetrievalJob [source]
Retrieves the point-in-time correct historical feature values for the specified entity rows.
- Parameters
config – The config for the current feature store.
feature_views – A list containing all feature views that are referenced in the entity rows.
feature_refs – The features to be retrieved.
entity_df – A collection of rows containing all entity columns on which features need to be joined, as well as the timestamp column used for point-in-time joins. Either a pandas dataframe can be provided or a SQL query.
registry – The registry for the current feature store.
project – Feast project to which the feature views belong.
full_feature_names – If True, feature names will be prefixed with the corresponding feature view name, changing them from the format “feature” to “feature_view__feature” (e.g. “daily_transactions” changes to “customer_fv__daily_transactions”).
- Returns
A RetrievalJob that can be executed to get the features.
- ingest_df(feature_view: feast.feature_view.FeatureView, df: pandas.core.frame.DataFrame)[source]
Persists a dataframe to the online store.
- Parameters
feature_view – The feature view to which the dataframe corresponds.
df – The dataframe to be persisted.
- ingest_df_to_offline_store(feature_view: feast.feature_view.FeatureView, df: pyarrow.lib.Table)[source]
Persists a pyarrow table to the offline store.
- Parameters
feature_view – The feature view to which the table corresponds.
df – The pyarrow table to be persisted.
- abstract materialize_single_feature_view(config: feast.repo_config.RepoConfig, feature_view: feast.feature_view.FeatureView, start_date: datetime.datetime, end_date: datetime.datetime, registry: feast.infra.registry.base_registry.BaseRegistry, project: str, tqdm_builder: Callable[[int], tqdm.std.tqdm]) None [source]
Writes latest feature values in the specified time range to the online store.
- Parameters
config – The config for the current feature store.
feature_view – The feature view to materialize.
start_date – The start of the time range.
end_date – The end of the time range.
registry – The registry for the current feature store.
project – Feast project to which the objects belong.
tqdm_builder – A function to monitor the progress of materialization.
- abstract online_read(config: feast.repo_config.RepoConfig, table: feast.feature_view.FeatureView, entity_keys: List[feast.types.EntityKey_pb2.EntityKey], requested_features: Optional[List[str]] = None) List[Tuple[Optional[datetime.datetime], Optional[Dict[str, feast.types.Value_pb2.Value]]]] [source]
Reads feature values for the given entity keys.
- Parameters
config – The config for the current feature store.
table – The feature view whose feature values should be read.
entity_keys – The list of entity keys for which feature values should be read.
requested_features – The list of features that should be read.
- Returns
A list of the same length as entity_keys. Each item in the list is a tuple where the first item is the event timestamp for the row, and the second item is a dict mapping feature names to values, which are returned in proto format.
- abstract online_write_batch(config: feast.repo_config.RepoConfig, table: feast.feature_view.FeatureView, data: List[Tuple[feast.types.EntityKey_pb2.EntityKey, Dict[str, feast.types.Value_pb2.Value], datetime.datetime, Optional[datetime.datetime]]], progress: Optional[Callable[[int], Any]]) None [source]
Writes a batch of feature rows to the online store.
If a tz-naive timestamp is passed to this method, it is assumed to be UTC.
- Parameters
config – The config for the current feature store.
table – Feature view to which these feature rows correspond.
data – A list of quadruplets containing feature data. Each quadruplet contains an entity key, a dict containing feature values, an event timestamp for the row, and the created timestamp for the row if it exists.
progress – Function to be called once a batch of rows is written to the online store, used to show progress.
- plan_infra(config: feast.repo_config.RepoConfig, desired_registry_proto: feast.core.Registry_pb2.Registry) feast.infra.infra_object.Infra [source]
Returns the Infra required to support the desired registry.
- Parameters
config – The RepoConfig for the current FeatureStore.
desired_registry_proto – The desired registry, in proto form.
- abstract retrieve_feature_service_logs(feature_service: feast.feature_service.FeatureService, start_date: datetime.datetime, end_date: datetime.datetime, config: feast.repo_config.RepoConfig, registry: feast.infra.registry.base_registry.BaseRegistry) feast.infra.offline_stores.offline_store.RetrievalJob [source]
Reads logged features for the specified time window.
- Parameters
feature_service – The feature service whose logs should be retrieved.
start_date – The start of the window.
end_date – The end of the window.
config – The config for the current feature store.
registry – The registry for the current feature store.
- Returns
A RetrievalJob that can be executed to get the feature service logs.
- abstract retrieve_saved_dataset(config: feast.repo_config.RepoConfig, dataset: feast.saved_dataset.SavedDataset) feast.infra.offline_stores.offline_store.RetrievalJob [source]
Reads a saved dataset.
- Parameters
config – The config for the current feature store.
dataset – A SavedDataset object containing all parameters necessary for retrieving the dataset.
- Returns
A RetrievalJob that can be executed to get the saved dataset.
- abstract teardown_infra(project: str, tables: Sequence[feast.feature_view.FeatureView], entities: Sequence[feast.entity.Entity])[source]
Tears down all cloud resources for the specified set of Feast objects.
- Parameters
project – Feast project to which the objects belong.
tables – Feature views whose corresponding infrastructure should be deleted.
entities – Entities whose corresponding infrastructure should be deleted.
- abstract update_infra(project: str, tables_to_delete: Sequence[feast.feature_view.FeatureView], tables_to_keep: Sequence[feast.feature_view.FeatureView], entities_to_delete: Sequence[feast.entity.Entity], entities_to_keep: Sequence[feast.entity.Entity], partial: bool)[source]
Reconciles cloud resources with the specified set of Feast objects.
- Parameters
project – Feast project to which the objects belong.
tables_to_delete – Feature views whose corresponding infrastructure should be deleted.
tables_to_keep – Feature views whose corresponding infrastructure should not be deleted, and may need to be updated.
entities_to_delete – Entities whose corresponding infrastructure should be deleted.
entities_to_keep – Entities whose corresponding infrastructure should not be deleted, and may need to be updated.
partial – If True, tables_to_delete and tables_to_keep are not exhaustive lists, so infrastructure corresponding to other feature views should not be touched.
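The partial flag changes only what happens to infrastructure that is mentioned in neither list; a toy reconciliation over table names shows the rule (semantics assumed from the docstring, heavily simplified — a real provider creates and drops cloud resources rather than set members):

```python
from typing import Sequence, Set

def sketch_update_infra(
    existing: Set[str],
    tables_to_delete: Sequence[str],
    tables_to_keep: Sequence[str],
    partial: bool,
) -> Set[str]:
    """Return the set of tables that should exist after reconciliation."""
    result = set(existing)
    result -= set(tables_to_delete)   # always drop what was asked
    result |= set(tables_to_keep)     # always create/keep what was asked
    if not partial:
        # the lists are exhaustive: anything unmentioned is also removed
        result &= set(tables_to_delete) | set(tables_to_keep)
    return result

existing = {"fv_a", "fv_b", "fv_c"}
after_partial = sketch_update_infra(existing, ["fv_a"], ["fv_b"], partial=True)
after_full = sketch_update_infra(existing, ["fv_a"], ["fv_b"], partial=False)
```

With partial=True the unmentioned "fv_c" survives; with partial=False it is removed along with "fv_a".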
- abstract write_feature_service_logs(feature_service: feast.feature_service.FeatureService, logs: Union[pyarrow.lib.Table, pathlib.Path], config: feast.repo_config.RepoConfig, registry: feast.infra.registry.base_registry.BaseRegistry)[source]
Writes features and entities logged by a feature server to the offline store.
The schema of the logs table is inferred from the specified feature service. Only feature services with configured logging are accepted.
- Parameters
feature_service – The feature service to be logged.
logs – The logs, either as an arrow table or as a path to a parquet directory.
config – The config for the current feature store.
registry – The registry for the current feature store.
- feast.infra.provider.get_provider(config: feast.repo_config.RepoConfig) feast.infra.provider.Provider [source]
Returns the provider instance specified by the given repo config.
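get_provider maps the provider name in the repo config to one of the classes documented above ("local", "gcp", "aws"). A toy version of that lookup (a sketch of the idea only — the real implementation also handles custom providers and constructs the instance with the config):

```python
import importlib

# Built-in provider names, mapped to the classes documented in this module.
_BUILTIN_PROVIDERS = {
    "local": "feast.infra.local.LocalProvider",
    "gcp": "feast.infra.gcp.GcpProvider",
    "aws": "feast.infra.aws.AwsProvider",
}

def sketch_get_provider_class(provider: str):
    """Resolve a provider string to a class: built-in name or dotted path."""
    path = _BUILTIN_PROVIDERS.get(provider, provider)
    module_name, _, class_name = path.rpartition(".")
    return getattr(importlib.import_module(module_name), class_name)

# Dotted-path resolution demonstrated with a stdlib class, so the sketch
# runs without feast installed.
cls = sketch_get_provider_class("collections.OrderedDict")
```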