feat: Optimize DynamoDB online store for improved latency #5889
ntkathole merged 2 commits into feast-dev:master
@robhowley mind taking a look?
Pull request overview
This PR optimizes the DynamoDB online store implementation to reduce feature serving latency through several performance improvements: dictionary-based response ordering, cached deserializer instances, entity ID computation caching, and improved default configuration values.
Changes:
- Replaced O(n log n) sorting with O(1) dictionary lookup in response processing
- Added cached TypeDeserializer and entity ID caching to reduce object instantiation overhead
- Enabled VPC endpoint support for async client via endpoint_url configuration
- Updated default configuration values for better performance (batch_size: 100, max_pool_connections: 50, adaptive retry mode, etc.)
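The dictionary-based response ordering mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code in `dynamodb.py`: the function name `order_responses` and the `entity_id` key are hypothetical, and the items are plain dicts rather than DynamoDB attribute maps. Building the index is O(n) once, and each lookup is O(1) on average, so no sort is needed to restore request order.

```python
from typing import Dict, List, Optional


def order_responses(
    entity_ids: List[str],
    items: List[Dict[str, str]],
) -> List[Optional[Dict[str, str]]]:
    """Return items in the order of entity_ids, with None for missing entities.

    Hypothetical helper: "entity_id" is an assumed key name for illustration.
    """
    # Build the lookup table once (O(n)).
    by_id = {item["entity_id"]: item for item in items}
    # Each requested id is resolved with an average O(1) dict lookup.
    # Duplicate ids in the request simply resolve to the same item.
    return [by_id.get(entity_id) for entity_id in entity_ids]


requested = ["a", "b", "a", "c"]
returned = [
    {"entity_id": "c", "value": "3"},
    {"entity_id": "a", "value": "1"},
]
ordered = order_responses(requested, returned)
```

Here `ordered` has one slot per requested id, so duplicates and missing entities keep their positions.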
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| sdk/python/feast/infra/online_stores/dynamodb.py | Core optimization changes including response processing refactor, caching implementations, VPC endpoint support, and updated configuration defaults |
| sdk/python/tests/unit/infra/online_store/test_dynamodb_online_store.py | Updated test assertions to reflect new default configuration values |
@ntkathole do you have any summary of performance changes for a fixed set of features or feature views?
@franciscojavierarceo In local testing I see processing savings of ~1.26 ms (for 1000 entities with duplicates), and I expect the config optimizations to have the real network-level impact.
  session_based_auth: bool = False
  """AWS session based client authentication"""

- max_pool_connections: int = 10
- """Max number of connections for async Dynamodb operations"""
+ max_pool_connections: int = 50
+ """Max number of connections for async Dynamodb operations.
+ Increase for high-throughput workloads."""

- keepalive_timeout: float = 12.0
- """Keep-alive timeout in seconds for async Dynamodb connections."""
+ keepalive_timeout: float = 30.0
+ """Keep-alive timeout in seconds for async Dynamodb connections.
+ Higher values help reuse connections under sustained load."""

- connect_timeout: Union[int, float] = 60
+ connect_timeout: Union[int, float] = 5
  """The time in seconds until a timeout exception is thrown when attempting to make
- an async connection."""
+ an async connection. Lower values enable faster failure detection."""

- read_timeout: Union[int, float] = 60
+ read_timeout: Union[int, float] = 10
  """The time in seconds until a timeout exception is thrown when attempting to read
- from an async connection."""
+ from an async connection. Lower values enable faster failure detection."""

- total_max_retry_attempts: Union[int, None] = None
+ total_max_retry_attempts: Union[int, None] = 3
  """Maximum number of total attempts that will be made on a single request.

  Maps to `retries.total_max_attempts` in botocore.config.Config.
  """

- retry_mode: Union[Literal["legacy", "standard", "adaptive"], None] = None
+ retry_mode: Union[Literal["legacy", "standard", "adaptive"], None] = "adaptive"
  """The type of retry mode (aio)botocore should use.

  Maps to `retries.mode` in botocore.config.Config.
+ 'adaptive' mode provides intelligent retry with client-side rate limiting.
  """
(Refers to lines 56-106)
🚩 Breaking change in default configuration values may affect existing deployments
Several default values have changed significantly:
- connect_timeout: 60s → 5s (12x reduction)
- read_timeout: 60s → 10s (6x reduction)
- total_max_retry_attempts: None → 3 (now bounded)
- batch_size: 40 → 100 (2.5x increase)
Existing deployments relying on defaults may experience different behavior. The lower timeouts will cause faster failure detection but could also cause more timeout errors in high-latency environments. The bounded retries could cause failures that previously would have eventually succeeded. These are reasonable production defaults but may require documentation/migration notes.
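To make the effect of the new defaults concrete, here is a hedged sketch of how such settings could be turned into keyword arguments for `botocore.config.Config`. The helper name `build_client_config_kwargs` is hypothetical and is not the function used in `dynamodb.py`; the sketch deliberately returns a plain dict so it runs without botocore installed. It follows the mapping stated in the docstrings above (`retries.total_max_attempts` and `retries.mode`).

```python
from typing import Any, Dict, Optional, Union


def build_client_config_kwargs(
    max_pool_connections: int = 50,
    connect_timeout: Union[int, float] = 5,
    read_timeout: Union[int, float] = 10,
    total_max_retry_attempts: Optional[int] = 3,
    retry_mode: Optional[str] = "adaptive",
) -> Dict[str, Any]:
    """Hypothetical helper: collect kwargs suitable for botocore.config.Config."""
    kwargs: Dict[str, Any] = {
        "max_pool_connections": max_pool_connections,
        "connect_timeout": connect_timeout,
        "read_timeout": read_timeout,
    }
    # Only include retry settings that are explicitly set, so that None
    # leaves the corresponding botocore default untouched.
    retries: Dict[str, Any] = {}
    if total_max_retry_attempts is not None:
        retries["total_max_attempts"] = total_max_retry_attempts
    if retry_mode is not None:
        retries["mode"] = retry_mode
    if retries:
        kwargs["retries"] = retries
    return kwargs


# New defaults from this PR.
new_defaults = build_client_config_kwargs()

# Previous defaults, for deployments that want the old behaviour back.
old_defaults = build_client_config_kwargs(
    max_pool_connections=10,
    connect_timeout=60,
    read_timeout=60,
    total_max_retry_attempts=None,
    retry_mode=None,
)
```

With botocore available, the result could be passed as `Config(**new_defaults)`. A deployment in a high-latency environment would set the old values explicitly rather than rely on the defaults.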
Signed-off-by: ntkathole <nikhilkathole2683@gmail.com>
…5889) Signed-off-by: ntkathole <nikhilkathole2683@gmail.com> Co-authored-by: Francisco Javier Arceo <arceofrancisco@gmail.com> Signed-off-by: yassinnouh21 <yassinnouh21@gmail.com>
# [0.60.0](v0.59.0...v0.60.0) (2026-02-17)

### Bug Fixes

* Added a flag to correctly download the go binaries ([0f77135](0f77135))
* Adds mapping of date Trino's type into string Feast's type ([531e839](531e839))
* **ci:** Use uv run for pytest in master_only benchmark step ([#5957](#5957)) ([5096010](5096010))
* Disable materialized odfvs for historical retrieval ([#5880](#5880)) ([739d28a](739d28a))
* Fix linting and formatting issues ([#5907](#5907)) ([42ca14a](42ca14a))
* Make timestamp field handling compatible with Athena V3 ([#5936](#5936)) ([e2bad34](e2bad34))
* Support pgvector under non-default schema ([#5970](#5970)) ([c636cd4](c636cd4))
* unit tests not running on main branch ([#5909](#5909)) ([62fe664](62fe664))
* Update java dep which blocking release ([#5903](#5903)) ([a5b8186](a5b8186))
* Update the dockerfile with golang 1.24.12. ([#5918](#5918)) ([be1b522](be1b522))
* Use context.Background() in client constructors ([#5897](#5897)) ([984f93a](984f93a))

### Features

* Add blog post for PyTorch ecosystem announcement ([#5906](#5906)) ([d2eb629](d2eb629))
* Add blog post on Feast dbt integration ([#5915](#5915)) ([b3c8138](b3c8138))
* Add DynamoDB in-place list update support for array-based features ([#5916](#5916)) ([aa5973f](aa5973f))
* Add HTTP connection pooling for remote online store client ([#5895](#5895)) ([e022bf8](e022bf8))
* Add integration tests for dbt import ([#5899](#5899)) ([a444692](a444692))
* Add lazy initialization and feature service caching ([#5924](#5924)) ([b37b7d0](b37b7d0))
* Add multiple entity support to dbt integration ([#5901](#5901)) ([05a4fb5](05a4fb5)), closes [#5872](#5872)
* Add PostgreSQL online store support for Go feature server ([#5963](#5963)) ([b8c6f3d](b8c6f3d))
* Add publish docker image of Go feature server. ([#5923](#5923)) ([759d8c6](759d8c6))
* Add Set as feature type ([#5888](#5888)) ([52458fc](52458fc))
* Added online server worker config support in operator ([#5926](#5926)) ([193c72a](193c72a))
* Added support for OpenLineage integration ([#5884](#5884)) ([df70d8d](df70d8d))
* Adjust ray offline store to support abfs(s) ADLS Azure Storage ([#5911](#5911)) ([d6c0b2d](d6c0b2d))
* Batch_engine config injection in feature_store.yaml through operator ([#5938](#5938)) ([455d56c](455d56c))
* Consolidate Python packaging - remove setup.py/setup.cfg, standardize on pyproject.toml and uv ([16696b8](16696b8))
* **go:** Add MySQL registry store support for Go feature server ([#5933](#5933)) ([19f9bb8](19f9bb8))
* Improve local dev experience with file-aware hooks and auto parallelization ([#5956](#5956)) ([839b79e](839b79e))
* Modernize precommit hooks and optimize test performance ([#5929](#5929)) ([ea7d4fa](ea7d4fa))
* Optimize container infrastructure for production ([#5881](#5881)) ([5ebdac8](5ebdac8))
* Optimize DynamoDB online store for improved latency ([#5889](#5889)) ([fcc8274](fcc8274))
What this PR does / why we need it:
Optimizes the DynamoDB online store implementation to reduce online feature serving latency.
- O(1) dictionary lookup per item instead of O(n log n) sorting - `_process_batch_get_response` now uses dictionary-based lookup for response ordering instead of sorting.
- Cached TypeDeserializer - Added class-level cached `TypeDeserializer` instance to avoid per-request object instantiation overhead in async reads.
- VPC endpoint support for async client - The `endpoint_url` config is now properly passed to the async aiobotocore client, enabling DynamoDB VPC endpoints for reduced network latency.
- Improved default configuration values:
  - batch_size: 40 → 100 (max allowed by DynamoDB BatchGetItem)
  - max_pool_connections: 10 → 50 (better concurrency)
  - keepalive_timeout: 12s → 30s (better connection reuse)
  - connect_timeout: 60s → 5s (faster failure detection)
  - read_timeout: 60s → 10s (faster failure detection)
  - total_max_retry_attempts: None → 3 (bounded retries)
  - retry_mode: None → "adaptive" (smart retry with rate limiting)
- Pre-allocated result lists - Response processing now pre-allocates result lists instead of using append-based growth.
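The caching and pre-allocation ideas in the description can be illustrated with a small, self-contained sketch. Everything here is an assumption for illustration: `StubDeserializer` is a toy stand-in for boto3's `TypeDeserializer` (it only understands `S` and integer `N` values), and `CachedReader` is a hypothetical class, not the Feast online store.

```python
from typing import Any, Dict, List, Optional


class StubDeserializer:
    """Toy stand-in for a DynamoDB type deserializer (illustration only)."""

    instances_created = 0  # counts constructions, to show the cache working

    def __init__(self) -> None:
        StubDeserializer.instances_created += 1

    def deserialize(self, value: Dict[str, str]) -> Any:
        # Handles only {"S": "text"} and {"N": "123"} (integers).
        type_tag, raw = next(iter(value.items()))
        return int(raw) if type_tag == "N" else raw


class CachedReader:
    """Hypothetical reader that shares one deserializer across all instances."""

    _deserializer: Optional[StubDeserializer] = None

    @classmethod
    def _get_deserializer(cls) -> StubDeserializer:
        # Create the deserializer once at class level and reuse it afterwards.
        if cls._deserializer is None:
            cls._deserializer = StubDeserializer()
        return cls._deserializer

    def read(
        self, raw_items: List[Optional[Dict[str, Dict[str, str]]]]
    ) -> List[Optional[Dict[str, Any]]]:
        deserializer = self._get_deserializer()
        # Pre-allocate one slot per input instead of growing via append.
        results: List[Optional[Dict[str, Any]]] = [None] * len(raw_items)
        for i, item in enumerate(raw_items):
            if item is not None:
                results[i] = {
                    key: deserializer.deserialize(value)
                    for key, value in item.items()
                }
        return results


rows = CachedReader().read([{"name": {"S": "x"}, "count": {"N": "3"}}, None])
more_rows = CachedReader().read([None])
```

Both `CachedReader` instances above share a single `StubDeserializer`, and missing items stay as `None` in their original positions.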