gcc/libstdc++-v3
Jonathan Wakely 84e39b0750
libstdc++: Add _Hashtable::_M_locate(const key_type&)
We have two overloads of _M_find_before_node but they have quite
different performance characteristics, which isn't necessarily obvious.

The original version, _M_find_before_node(bucket, key, hash_code), looks
only in the specified bucket, doing a linear search within that bucket
for an element that compares equal to the key. This is the typical fast
lookup for hash containers, assuming the load factor is low so that each
bucket isn't too large.
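
As a rough illustration of that single-bucket search, here is a minimal
sketch. ToyTable, bucket_index and find_in_bucket are hypothetical names
made up for this example, and the one-list-per-bucket layout is an
assumption chosen for brevity. It is not how _Hashtable stores its nodes.

```cpp
#include <cassert>
#include <cstddef>
#include <forward_list>
#include <functional>
#include <vector>

// Hypothetical toy table, not libstdc++ code: one list per bucket.
struct ToyTable {
  std::vector<std::forward_list<int>> buckets;

  explicit ToyTable(std::size_t n) : buckets(n) { assert(n > 0); }

  // Map a hash code to a bucket index.
  std::size_t bucket_index(std::size_t code) const {
    return code % buckets.size();
  }

  void insert(int key) {
    buckets[bucket_index(std::hash<int>{}(key))].push_front(key);
  }

  // Analogue of the (bucket, key, hash_code) lookup: scans one bucket only,
  // so the cost depends on the bucket's length, not the container's size.
  const int* find_in_bucket(std::size_t bkt, int key) const {
    for (const int& k : buckets[bkt])
      if (k == key)
        return &k;
    return nullptr;
  }
};
```

The caller computes the hash code and bucket once and passes them in,
which is what keeps this form of lookup cheap when buckets are short.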

The newer _M_find_before_node(key) was added in r12-6272-ge3ef832a9e8d6a
and could be naively assumed to calculate the hash code and bucket for
key and then call the efficient _M_find_before_node(bkt, key, code)
function. But in fact it does a linear search of the entire container.
This is potentially very slow and should only be used for a suitably
small container, as determined by the __small_size_threshold() function.
We don't even have a comment pointing out this O(N) performance of the
newer overload.

Additionally, the newer overload is only ever used in exactly one place,
which would suggest it could just be removed. However, there are several
places that do the linear search of the whole container with an explicit
loop each time.

This adds a new member function, _M_locate, and uses it to replace most
uses of _M_find_node and the loops doing linear searches. This new
member function does both forms of lookup, the linear search for small
sizes and the _M_find_node(bkt, key, code) lookup within a single
bucket. The new function returns a __location_type which is a struct
that contains a pointer to the first node matching the key (if such a
node is present), or the hash code and bucket index for the key. The
hash code and bucket index allow the caller to know where a new node
with that key should be inserted, for the cases where the lookup didn't
find a matching node.
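
The combined lookup described above might be sketched as follows.
ToySet, ToyLocation and the small_threshold parameter are hypothetical
stand-ins for _Hashtable, __location_type and __small_size_threshold(),
and the storage layout is an assumption for illustration only. It is not
the code added by this change.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <vector>

// Hypothetical result type: either a found element, or where to insert.
struct ToyLocation {
  const int* node = nullptr;   // matching element, null if absent
  std::size_t hash_code = 0;   // filled in when no match was found
  std::size_t bucket = 0;      // filled in when no match was found
  bool found() const { return node != nullptr; }
};

class ToySet {
  std::deque<int> elems_;                          // all elements, stable refs
  std::vector<std::vector<std::size_t>> buckets_;  // indices into elems_
  std::size_t small_threshold_;                    // assumed tuning knob

public:
  ToySet(std::size_t bucket_count, std::size_t small_threshold)
  : buckets_(bucket_count), small_threshold_(small_threshold)
  { assert(bucket_count > 0); }

  ToyLocation locate(int key) const {
    ToyLocation loc;
    const bool small = elems_.size() <= small_threshold_;
    if (small) {
      // Small container: scan every element without hashing the key.
      for (const int& e : elems_)
        if (e == key) { loc.node = &e; return loc; }
    }
    // Hash the key; needed for the bucket search and for insertion.
    loc.hash_code = std::hash<int>{}(key);
    loc.bucket = loc.hash_code % buckets_.size();
    if (!small) {
      // Larger container: search only the key's bucket.
      for (std::size_t i : buckets_[loc.bucket])
        if (elems_[i] == key) { loc.node = &elems_[i]; return loc; }
    }
    return loc;
  }

  // On a miss, reuses the bucket from locate() instead of rehashing.
  bool insert(int key) {
    ToyLocation loc = locate(key);
    if (loc.found())
      return false;
    elems_.push_back(key);
    buckets_[loc.bucket].push_back(elems_.size() - 1);
    return true;
  }
};
```

The point of returning the hash code and bucket on a miss is visible in
insert(): the caller does not have to hash the key a second time.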

The result struct actually contains a pointer to the node *before* the
one that was located, as that is needed for it to be useful in erase and
extract members. There is a member function that returns the found node,
i.e. _M_before->_M_nxt downcast to __node_ptr, which should be used in
most cases.
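
To show why a pointer to the preceding node is useful in a singly-linked
list, here is a small sketch. NodeBase, Node, BeforeLocation and ToyList
are hypothetical names for this example, loosely modelled on the
description above. They are not the types in hashtable.h.

```cpp
#include <cassert>

struct NodeBase { NodeBase* next = nullptr; };

struct Node : NodeBase {
  int value;
  explicit Node(int v) : value(v) {}
};

// Hypothetical analogue of the result struct: holds the node *before*
// the match, so that the match can be unlinked without a second search.
struct BeforeLocation {
  NodeBase* before;
  explicit BeforeLocation(NodeBase* b = nullptr) : before(b) {}

  // The found node is before->next, downcast to the full node type.
  Node* node() const {
    return before ? static_cast<Node*>(before->next) : nullptr;
  }
};

class ToyList {
  NodeBase head_;  // sentinel playing the "before begin" role

  static void erase_after(NodeBase* prev) {
    Node* victim = static_cast<Node*>(prev->next);
    prev->next = victim->next;
    delete victim;
  }

public:
  ToyList() = default;
  ToyList(const ToyList&) = delete;
  ToyList& operator=(const ToyList&) = delete;
  ~ToyList() { while (head_.next) erase_after(&head_); }

  void push_front(int v) {
    Node* n = new Node(v);
    n->next = head_.next;
    head_.next = n;
  }

  // Returns the node before the first match, or an empty location.
  BeforeLocation locate(int v) {
    for (NodeBase* prev = &head_; prev->next; prev = prev->next)
      if (static_cast<Node*>(prev->next)->value == v)
        return BeforeLocation{prev};
    return BeforeLocation{};
  }

  // Erasing needs only the predecessor; no traversal is repeated.
  void erase(BeforeLocation loc) {
    if (loc.before)
      erase_after(loc.before);
  }
};
```

Callers that only want the element use node(), while erase-like
operations use the stored predecessor directly.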

This new function greatly simplifies the functions that currently have
to do two kinds of lookup and explicitly check the current size against
the small size threshold.

Additionally, now that try_emplace is defined directly in _Hashtable
(not in _Insert_base) we can use _M_locate in there too, to speed up
some try_emplace calls. Previously it did not do the small-size linear
search.

It would be possible to add a function to get a __location_type from an
iterator, and then rewrite some functions like _M_erase and
_M_extract_node to take a __location_type parameter. While that might be
conceptually nice, it wouldn't really make the code any simpler or more
readable than it is now. That isn't done in this change.

libstdc++-v3/ChangeLog:

	* include/bits/hashtable.h (__location_type): New struct.
	(_M_locate): New member function.
	(_M_find_before_node(const key_type&)): Remove.
	(_M_find_node): Move variable initialization into condition.
	(_M_find_node_tr): Likewise.
	(operator=(initializer_list<T>), try_emplace, _M_reinsert_node)
	(_M_merge_unique, find, erase(const key_type&)): Use _M_locate
	for lookup.
2024-11-13 20:21:41 +00:00
config aarch64: libstdc++: Use shufflevector instead of shuffle in opt_random.h 2024-10-24 15:01:23 +01:00
doc libstdc++: Deprecate useless <cxxx> compatibility headers for C++17 2024-11-06 12:47:19 +00:00
include libstdc++: Add _Hashtable::_M_locate(const key_type&) 2024-11-13 20:21:41 +00:00
libsupc++ ibstdc++: Add some further attributes to ::operator new in <new> 2024-11-08 22:07:33 +01:00
po
python libstdc++: Fix Python deprecation warning in printers.py 2024-10-16 10:09:16 +01:00
scripts libstdc++: Write timestamp to libstdc++-performance.sum file 2024-11-13 20:21:29 +00:00
src libstdc++: Enable debug assertions for filesystem directory iterators 2024-11-06 12:47:18 +00:00
testsuite libstdc++: Simplify _Hashtable merge functions 2024-11-13 20:21:41 +00:00
.editorconfig libstdc++: Add .editorconfig files 2024-09-16 10:10:23 +01:00
acinclude.m4 libstdc++: #ifdef out #pragma GCC system_header 2024-09-25 08:20:45 -04:00
aclocal.m4
ChangeLog Daily bump. 2024-11-12 00:19:15 +00:00
ChangeLog-1998
ChangeLog-1999
ChangeLog-2000
ChangeLog-2001
ChangeLog-2002
ChangeLog-2003
ChangeLog-2004
ChangeLog-2005
ChangeLog-2006
ChangeLog-2007
ChangeLog-2008
ChangeLog-2009
ChangeLog-2010
ChangeLog-2011
ChangeLog-2012
ChangeLog-2013
ChangeLog-2014
ChangeLog-2015
ChangeLog-2016
ChangeLog-2017
ChangeLog-2018
ChangeLog-2019
ChangeLog-2020
ChangeLog-2021
ChangeLog-2022
ChangeLog-2023
config.h.in libstdc++: Fix autoconf check for O_NONBLOCK in <fcntl.h> 2024-08-28 21:34:22 +01:00
configure libstdc++: #ifdef out #pragma GCC system_header 2024-09-25 08:20:45 -04:00
configure.ac libstdc++: Fix autoconf check for O_NONBLOCK in <fcntl.h> 2024-08-28 21:34:22 +01:00
configure.host
crossconfig.m4
fragment.am
linkage.m4
Makefile.am
Makefile.in
README

file: libstdc++-v3/README

New users may wish to point their web browsers to the file
index.html in the 'doc/html' subdirectory.  It contains brief
building instructions and notes on how to configure the library in
interesting ways.