Skip to content

Comments

gh-142889: Restructure PyDictKeysObject memory layout for simpler access#145097

Open
clintonsteiner wants to merge 4 commits intopython:mainfrom
clintonsteiner:gh-142889
Open

gh-142889: Restructure PyDictKeysObject memory layout for simpler access#145097
clintonsteiner wants to merge 4 commits intopython:mainfrom
clintonsteiner:gh-142889

Conversation

@clintonsteiner
Copy link

@clintonsteiner clintonsteiner commented Feb 22, 2026

Restructure dict keys allocation to store dk_indices before the PyDictKeysObject header and keep dk_entries after the header.

Update dict index access and related allocation/free/clone paths, adjust gdb dict entry location logic, and add layout coverage tests.

Local dict microbenchmarks showed about a 1.4% overall improvement, with most operations around 1-2% faster.

@python-cla-bot
Copy link

python-cla-bot bot commented Feb 22, 2026

All commit authors signed the Contributor License Agreement.

CLA signed

…er entry access

Restructure dict keys allocation to store dk_indices before the PyDictKeysObject header and keep dk_entries after the header.

Update dict index access and related allocation/free/clone paths, adjust gdb dict entry location logic, and add layout coverage tests.

Local dict microbenchmarks showed about a 1.4% overall improvement, with most operations around 1-2% faster.
@@ -16,7 +16,6 @@ As of Python 3.6, this is compact and ordered. Basic idea is described here:

layout:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this layout comment need updating (move dk_entries[] to the top and demark where a PyDictKeysObject * actually points to)?

Also, I feel like the definition for PyDictKeysObject (struct _dictkeysobject) needs updating:

struct _dictkeysobject {
Py_ssize_t dk_refcnt;
/* Size of the hash table (dk_indices). It must be a power of 2. */
uint8_t dk_log2_size;
/* Size of the hash table (dk_indices) by bytes. */
uint8_t dk_log2_index_bytes;
/* Kind of keys */
uint8_t dk_kind;
#ifdef Py_GIL_DISABLED
/* Lock used to protect shared keys */
PyMutex dk_mutex;
#endif
/* Version number -- Reset to 0 by any modification to keys */
uint32_t dk_version;
/* Number of usable entries in dk_entries. */
Py_ssize_t dk_usable;
/* Number of used entries in dk_entries. */
Py_ssize_t dk_nentries;
/* Actual hash table of dk_size entries. It holds indices in dk_entries,
or DKIX_EMPTY(-1) or DKIX_DUMMY(-2).
Indices must be: 0 <= indice < USABLE_FRACTION(dk_size).
The size in bytes of an indice depends on dk_size:
- 1 byte if dk_size <= 0xff (char*)
- 2 bytes if dk_size <= 0xffff (int16_t*)
- 4 bytes if dk_size <= 0xffffffff (int32_t*)
- 8 bytes otherwise (int64_t*)
Dynamically sized, SIZEOF_VOID_P is minimum. */
char dk_indices[]; /* char is required to avoid strict aliasing. */
/* "PyDictKeyEntry or PyDictUnicodeEntry dk_entries[USABLE_FRACTION(DK_SIZE(dk))];" array follows:
see the DK_ENTRIES() / DK_UNICODE_ENTRIES() functions below */
};

Probably just change char dk_indices[] to char dk_entries[] and add a comment saying that dk_indicies will be put above the header. I'm not sure about the consequences of renaming dk_indices though, so you could also just add a comment.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, was trying to keep diff small but agreed this is no longer applicable

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants