ConcurrentFixedSwissTable is based on Google’s SwissTable, utilizing control bytes to implement fine-grained spinlocks at the slot level, specifically designed for high-concurrency lookup and insertion operations. However, it lacks support for deletion and automatic rehashing.
ConcurrentTransientHashSet and ConcurrentTransientHashMap build upon ConcurrentFixedSwissTable, using a simple lock-and-copy mechanism to provide automatic rehashing capabilities, enhancing their practicality. As indicated by their names, these structures do not support deletion and are intended for short-lived concurrent constructions and lookups, making it easy to clear them when no longer needed.

#include <babylon/concurrent/transient_hash_table.h>
using ::babylon::ConcurrentTransientHashSet;
using ::babylon::ConcurrentTransientHashMap;
// The default initial capacity is very small (16 or 32). For single-use, specify an appropriate size based on the scenario.
// For repeated use, `clear` will retain the previous capacity.
ConcurrentTransientHashSet<::std::string> set;
ConcurrentTransientHashSet<::std::string> set(1024);
ConcurrentTransientHashMap<::std::string, ::std::string> map;
ConcurrentTransientHashMap<::std::string, ::std::string> map(1024);
// Lookup and insertion are thread-safe
set.emplace("10086");
map.emplace("10086", "10010");
set.find("10086"); // != set.end();
map.find("10086"); // != map.end();
// Traversal is possible
for (auto& value : set) {
    // value == "10086"
}
for (auto& pair : map) {
    // pair.first == "10086"
    // pair.second == "10010"
}
// Clear for the next reuse
set.clear();
map.clear();
The benchmark code can be found in bench/bench_concurrent_hash_table.cpp. It randomly generates 1 million uint64_t data, forming a specified duplication rate and evenly distributing it among multiple threads for concurrent insertion. It evaluates the number of operations per second (single-threaded) and CPU usage per 1,000 operations. The performance of several typical open-source concurrent hash tables is compared in pure concurrent insertion scenarios, including TBB, Folly, and highly-rated personal libraries such as parallel-hashmap and libcuckoo.
| Threads | 1 | 4 | 16 | 1 | 4 | 16 | ||
|---|---|---|---|---|---|---|---|---|
| QPS | QPS | QPS | QPS | QPS | CPU | QPS | CPU | |
tbb::concurrent_unordered_set | 
      2136 | 2923 (730) | 4065 (254) | 1760 | 2380 (595) | 1.096 | 4651 (290) | 1.33 | 
tbb::concurrent_hash_set | 
      5154 | 4255 (1063) | 4807 (300) | 5181 | 3546 (221) | 0.828 | 4566 (285) | 2.03 | 
phmap::parallel_flat_hash_set | 
      27855 | 14224 (3556) | 14814 (925) | 18148 | 12870 (3217) | 0.294 | 15455 (965) | 0.961 | 
folly::ConcurrentHashMap | 
      4310 | 10952 (2738) | 34129 (2133) | 4132 | 11723 (2930) | 0.124 | 36231 (2264) | 0.049 | 
libcuckoo::cuckoohash_map | 
      4424 | 8928 (2232) | 31746 (1984) | 5263 | 14792 (3698) | 0.223 | 55248 (3453) | 0.166 | 
folly::AtomicUnorderedInsertMap | 
      5847 | 9090 (2272) | 29673 (1854) | 10460 | 29761 (7440) | 0.095 | 62111 (3881) | 0.116 | 
folly::AtomicHashMap | 
      7936 | 19455 (4863) | 61349 (3834) | 6896 | 20366 (5091) | 0.145 | 65789 (4111) | 0.110 | 
babylon::ConcurrentTransientHashSet | 
      24509 | 39062 (9765) | 129870 (8116) | 17825 | 36231 (9057) | 0.096 | 125944 (7871) | 0.110 | 
| Threads | 1 | 4 | 16 | 1 | 4 | 16 | ||
|---|---|---|---|---|---|---|---|---|
| QPS | QPS | QPS | QPS | QPS | CPU | QPS | CPU | |
tbb::concurrent_unordered_set | 
      2962 | 3921 (980) | 6711 (419) | 2518 | 4671 (1168) | 0.584 | 5988 (374) | 1.42 | 
tbb::concurrent_hash_set | 
      5882 | 6211 (1552) | 8333 (820) | 5847 | 6172 (1543) | 0.506 | 8333 (520) | 1.21 | 
phmap::parallel_flat_hash_set | 
      32467 | 21929 (5482) | 22172 (1385) | 30769 | 22222 (5555) | 0.172 | 23364 (1460) | 0.643 | 
folly::ConcurrentHashMap | 
      4405 | 10526 (2631) | 33003 (2062) | 4149 | 10752 (2688) | 0.178 | 34482 (2155) | 0.073 | 
libcuckoo::cuckoohash_map | 
      5952 | 12804 (3201) | 46511 (2906) | 6849 | 16949 (4237) | 0.203 | 61349 (3834) | 0.174 | 
folly::AtomicUnorderedInsertMap | 
      12690 | 27700 (6925) | 54644 (3415) | 17857 | 45248 (11312) | 0.068 | 72992 (4562) | 0.134 | 
folly::AtomicHashMap | 
      11025 | 27397 (6849) | 88495 (5530) | 11198 | 30769 (7692) | 0.102 | 99009 (6188) | 0.081 | 
babylon::ConcurrentTransientHashSet | 
      29239 | 49504 (12376) | 166666 (10416) | 33222 | 52910 (13227) | 0.069 | 172413 (10775) | 0.081 | 
| Threads | 1 | 4 | 16 | 1 | 4 | 16 | ||
|---|---|---|---|---|---|---|---|---|
| QPS | QPS | QPS | QPS | QPS | CPU | QPS | CPU | |
tbb::concurrent_unordered_set | 
      7633 | 16666 (4166) | 30211 (1888) | 7751 | 15847 (3961) | 0.224 | 25125 (1570) | 0.498 | 
tbb::concurrent_hash_set | 
      12722 | 18903 (4729) | 43290 (2705) | 14164 | 20120 (5030) | 0.181 | 61728 (3858) | 0.181 | 
folly::ConcurrentHashMap | 
      7812 | 16447 (4111) | 51282 (3205) | ```markdown | ||||
| 8474 | 16806 (4201) | 0.174 | 51282 (3205) | 0.129 | ||||
libcuckoo::cuckoohash_map | 
      10460 | 20325 (5081) | 72992 (4562) | 9900 | 21052 (5263) | 0.171 | 75757 (4734) | 0.171 | 
folly::AtomicUnorderedInsertMap | 
      36900 | 91743 (22935) | 85470 (5341) | 48309 | 129198 (32299) | 0.026 | 90909 (5681) | 0.155 | 
phmap::parallel_flat_hash_set | 
      61349 | 80645 (20161) | 135501 (8468) | 79365 | 96153 (24038) | 0.040 | 123456 (7716) | 0.123 | 
folly::AtomicHashMap | 
      18348 | 57471 (14367) | 204081 (12755) | 23923 | 78740 (19685) | 0.046 | 264550 (16534) | 0.041 | 
babylon::ConcurrentTransientHashSet | 
      59523 | 133333 (33333) | 401606 (25100) | 80645 | 173010 (43252) | 0.022 | 502512 (31407) | 0.025 | 
phmap, which wraps the Swiss table with simple sharded locks, performs decently but shows significant performance degradation as concurrency increases.folly::ConcurrentHashMap offers stable iterator functionality with deletion support through hazard pointers, maintaining good concurrency capabilities, making it quite versatile. CMU’s libcuckoo provides a surprisingly good engineering implementation with concurrent performance exceeding that of Folly when deletion is not a concern.babylon::ConcurrentTransientHashSet supports arbitrary keys while further enhancing throughput.
```