How to retrieve an element from a set without removing it?

Suppose the following:

>>> s = set([1, 2, 3])

How do I get a value (any value) out of s without doing s.pop()? I want to leave the item in the set until I am sure I can remove it - something I can only be sure of after an asynchronous call to another host.

Quick and dirty:

>>> elem = s.pop()
>>> s.add(elem)

But do you know of a better way? Ideally in constant time.


Asked by: Audrey885 | Posted: 28-01-2022






Answer 1

Two options that don't require copying the whole set:

for e in s:
    break
# e is now an element from s

Or...

e = next(iter(s))

But in general, sets don't support indexing or slicing.
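
For the use case in the question (keep the item in the set until a remote call confirms it can go), the peek can be combined with a later removal. A minimal sketch, assuming a hypothetical confirm callable that stands in for the asynchronous call to the other host; the names peek and process_one are made up for illustration:

```python
def peek(s):
    """Return an arbitrary element of s without removing it."""
    return next(iter(s))  # raises StopIteration if s is empty


def process_one(s, confirm):
    """Peek at an element and remove it only once confirm(elem) succeeds.

    confirm is a hypothetical stand-in for the remote call in the question.
    """
    elem = peek(s)
    if confirm(elem):
        s.discard(elem)  # discard() does not fail if elem is already gone
        return True
    return False


pending = {1, 2, 3}
process_one(pending, lambda e: False)  # not confirmed: set is unchanged
assert len(pending) == 3
process_one(pending, lambda e: True)   # confirmed: one element removed
assert len(pending) == 2
```

Using discard() rather than remove() is a design choice here: it stays quiet if something else removed the element while the call was in flight.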

Answered by: Sienna960 | Posted: 01-03-2022



Answer 2

Least code would be:

>>> s = set([1, 2, 3])
>>> list(s)[0]
1

Obviously this would create a new list which contains each member of the set, so not great if your set is very large.
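
If a handful of elements is needed rather than one, the full copy can still be avoided. A minimal sketch using itertools.islice from the standard library (the helper name take is made up for illustration):

```python
from itertools import islice


def take(s, n):
    """Return up to n elements of s as a list without copying the whole set."""
    return list(islice(s, n))


s = set(range(1_000_000))
few = take(s, 3)
assert len(few) == 3 and all(e in s for e in few)
```

Only the requested number of elements is read from the set's iterator, so the cost depends on n rather than on len(s).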

Answered by: Charlie550 | Posted: 01-03-2022



Answer 3

I wondered how the functions will perform for different sets, so I did a benchmark:

from random import sample

def ForLoop(s):
    for e in s:
        break
    return e

def IterNext(s):
    return next(iter(s))

def ListIndex(s):
    return list(s)[0]

def PopAdd(s):
    e = s.pop()
    s.add(e)
    return e

def RandomSample(s):
    # sample() accepts a set directly only up to Python 3.10 (TypeError from 3.11)
    return sample(s, 1)

def SetUnpacking(s):
    e, *_ = s
    return e

from simple_benchmark import benchmark
from iteration_utilities import first

b = benchmark([ForLoop, IterNext, ListIndex, PopAdd, RandomSample, SetUnpacking, first],
              {2**i: set(range(2**i)) for i in range(1, 20)},
              argument_name='set size',
              function_aliases={first: 'First'})

b.plot()

(Plot: run time of each function versus set size.)

This plot clearly shows that some approaches (RandomSample, SetUnpacking and ListIndex) scale with the size of the set and should be avoided in the general case (at least if performance might be important). As the other answers already showed, the fastest way is ForLoop.

However, as long as one of the constant-time approaches is used, the performance difference will be negligible.
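
For readers without simple_benchmark installed, the same effect can be checked roughly with timeit from the standard library. This is only a sketch: the two sizes and the repeat count are arbitrary, and the absolute numbers will differ from machine to machine.

```python
from timeit import timeit


def iter_next(s):
    return next(iter(s))      # does not depend on len(s)


def list_index(s):
    return list(s)[0]         # copies every element first


for size in (10, 10_000):
    s = set(range(size))
    t_iter = timeit(lambda: iter_next(s), number=1_000)
    t_list = timeit(lambda: list_index(s), number=1_000)
    print(f"size={size:>6}  next(iter(s)): {t_iter:.5f}s  list(s)[0]: {t_list:.5f}s")
```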


iteration_utilities (Disclaimer: I'm the author) contains a convenience function for this use-case: first:

>>> from iteration_utilities import first
>>> first({1,2,3,4})
1

I also included it in the benchmark above. It can compete with the other two "fast" solutions but the difference isn't much either way.

Answered by: Ada178 | Posted: 01-03-2022



Answer 4

tl;dr

for first_item in muh_set: break remains the optimal approach in Python 3.x. Curse you, Guido.

y u do this

Welcome to yet another set of Python 3.x timings, extrapolated from wr.'s excellent Python 2.x-specific response. Unlike AChampion's equally helpful Python 3.x-specific response, the timings below also cover the outlier solutions suggested above, namely list(s)[0] and random.sample(s, 1).

Code Snippets for Great Joy

Turn on, tune in, time it:

from timeit import Timer

stats = [
    "for i in range(1000): \n\tfor x in s: \n\t\tbreak",
    "for i in range(1000): next(iter(s))",
    "for i in range(1000): s.add(s.pop())",
    "for i in range(1000): list(s)[0]",
    "for i in range(1000): random.sample(s, 1)",
]

for stat in stats:
    t = Timer(stat, setup="import random\ns=set(range(100))")
    try:
        print("Time for %s:\t %f"%(stat, t.timeit(number=1000)))
    except:
        t.print_exc()

Quickly Obsoleted Timeless Timings

Behold! Ordered by fastest to slowest snippets:

$ ./test_get.py
Time for for i in range(1000): 
    for x in s: 
        break:   0.249871
Time for for i in range(1000): next(iter(s)):    0.526266
Time for for i in range(1000): s.add(s.pop()):   0.658832
Time for for i in range(1000): list(s)[0]:   4.117106
Time for for i in range(1000): random.sample(s, 1):  21.851104

Faceplants for the Whole Family

Unsurprisingly, manual iteration remains at least twice as fast as the next fastest solution. Although the gap has decreased from the Bad Old Python 2.x days (in which manual iteration was at least four times as fast), it disappoints the PEP 20 zealot in me that the most verbose solution is the best. At least converting a set into a list just to extract the first element of the set is as horrible as expected. Thank Guido, may his light continue to guide us.

Surprisingly, the RNG-based solution is absolutely horrible. List conversion is bad, but random really takes the awful-sauce cake. So much for the Random Number God.

I just wish the amorphous They would PEP up a set.get_first() method for us already. If you're reading this, They: "Please. Do something."

Answered by: Emma602 | Posted: 01-03-2022



Answer 5

To provide some timing figures behind the different approaches, consider the following code (Python 2). The get() is my custom addition to Python's setobject.c; it is just a pop() that does not remove the element.

from timeit import *

stats = ["for i in xrange(1000): iter(s).next()   ",
         "for i in xrange(1000): \n\tfor x in s: \n\t\tbreak",
         "for i in xrange(1000): s.add(s.pop())   ",
         "for i in xrange(1000): s.get()          "]

for stat in stats:
    t = Timer(stat, setup="s=set(range(100))")
    try:
        print "Time for %s:\t %f"%(stat, t.timeit(number=1000))
    except:
        t.print_exc()

The output is:

$ ./test_get.py
Time for for i in xrange(1000): iter(s).next()   :       0.433080
Time for for i in xrange(1000):
        for x in s:
                break:   0.148695
Time for for i in xrange(1000): s.add(s.pop())   :       0.317418
Time for for i in xrange(1000): s.get()          :       0.146673

This means that the for/break solution is the fastest of the stock approaches, roughly on par with (and sometimes faster than) the custom get() solution.

Answered by: Melissa176 | Posted: 01-03-2022



Answer 6

If you want a random element, this will also work:

>>> import random
>>> s = set([1,2,3])
>>> random.sample(s, 1)
[2]

The documentation doesn't seem to mention performance of random.sample. From a really quick empirical test with a huge list and a huge set, it seems to be constant time for a list but not for the set. Also, iteration over a set isn't random; the order is undefined but predictable:

>>> list(set(range(10))) == list(range(10))
True

If randomness is important and you need a bunch of elements in constant time (large sets), I'd use random.sample and convert to a list first:

>>> lst = list(s) # once, O(len(s))?
...
>>> e = random.sample(lst, 1)[0] # constant time
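
Note that on newer Python versions random.sample no longer accepts a set directly (passing one was deprecated in 3.9 and raises TypeError from 3.11), so the conversion to a sequence is needed anyway. A minimal sketch of the convert-once idea, using random.choice for single elements:

```python
import random

s = {1, 2, 3}

seq = tuple(s)             # one-time O(len(s)) copy
e = random.choice(seq)     # each later pick is constant time
assert e in s

# random.sample still works on the sequence if several distinct items are needed
pair = random.sample(seq, 2)
assert len(set(pair)) == 2
```

The copy has to be rebuilt whenever the set changes, so this only pays off when many picks are made between modifications.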

Answered by: Richard212 | Posted: 01-03-2022



Answer 7

Seemingly the most compact (6 symbols) though very slow way to get a set element (made possible by PEP 3132):

e,*_=s

With Python 3.5+ you can also use this 7-symbol expression (thanks to PEP 448):

[*s][0]

Both options are roughly 1000 times slower on my machine than the for-loop method.

Answered by: Agata184 | Posted: 01-03-2022



Answer 8

Yet another way in Python 3:

next(iter(s))

or

s.__iter__().__next__()

Answered by: Chester346 | Posted: 01-03-2022



Answer 9

I use a utility function I wrote. Its name is somewhat misleading because it kind of implies it might be a random item or something like that.

def anyitem(iterable):
    try:
        return next(iter(iterable))
    except StopIteration:
        return None
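
One caveat with returning None: the caller cannot tell an empty iterable from one whose first item is None. A sketch of a variant with an explicit default (the sentinel _MISSING and the name first_item are illustrative, not part of the answer above):

```python
_MISSING = object()  # unique marker meaning "no default supplied"


def first_item(iterable, default=_MISSING):
    """Return the first item of iterable, or default if it is empty."""
    for item in iterable:
        return item
    if default is _MISSING:
        raise ValueError("first_item() called on an empty iterable")
    return default


assert first_item({None}) is None          # a real element
assert first_item(set(), default=0) == 0   # empty, so the default is used
```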

Answered by: Victoria397 | Posted: 01-03-2022



Answer 10

Following @wr.'s post, I get similar results (for Python 3.5):

from timeit import *

stats = ["for i in range(1000): next(iter(s))",
         "for i in range(1000): \n\tfor x in s: \n\t\tbreak",
         "for i in range(1000): s.add(s.pop())"]

for stat in stats:
    t = Timer(stat, setup="s=set(range(100000))")
    try:
        print("Time for %s:\t %f"%(stat, t.timeit(number=1000)))
    except:
        t.print_exc()

Output:

Time for for i in range(1000): next(iter(s)):    0.205888
Time for for i in range(1000): 
    for x in s: 
        break:                                   0.083397
Time for for i in range(1000): s.add(s.pop()):   0.226570

However, when the underlying set is modified (e.g. by calls to remove()), things go badly for the iterator-based examples (for, iter):

from timeit import *

stats = ["while s:\n\ta = next(iter(s))\n\ts.remove(a)",
         "while s:\n\tfor x in s: break\n\ts.remove(x)",
         "while s:\n\tx=s.pop()\n\ts.add(x)\n\ts.remove(x)"]

for stat in stats:
    t = Timer(stat, setup="s=set(range(100000))")
    try:
        print("Time for %s:\t %f"%(stat, t.timeit(number=1000)))
    except:
        t.print_exc()

Results in:

Time for while s:
    a = next(iter(s))
    s.remove(a):             2.938494
Time for while s:
    for x in s: break
    s.remove(x):             2.728367
Time for while s:
    x=s.pop()
    s.add(x)
    s.remove(x):             0.030272
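
A plausible explanation, assuming CPython's implementation: a fresh iterator always starts scanning the hash table from the beginning and has to step over every slot that earlier removals left empty, while pop() remembers where its last search stopped. When the goal is to drain the set, that makes peek-then-remove quadratic overall. A small sketch to reproduce the effect (the size of 20,000 is arbitrary and the timings are machine-dependent):

```python
from time import perf_counter


def drain_with_iter(s):
    """Peek with next(iter(s)), then remove; rescans emptied slots each time."""
    while s:
        x = next(iter(s))
        s.remove(x)


def drain_with_pop(s):
    """Remove with pop(), which keeps track of its search position."""
    while s:
        s.pop()


for drain in (drain_with_iter, drain_with_pop):
    s = set(range(20_000))
    start = perf_counter()
    drain(s)
    print(f"{drain.__name__}: {perf_counter() - start:.4f}s")
```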

Answered by: Joyce431 | Posted: 01-03-2022



Answer 11

What I usually do for small collections is to create a kind of parser/converter function like this:

def convertSetToList(setName):
    return list(setName)

Then I can use the new list and access it by index:

userFields = convertSetToList(user)
name = request.json[userFields[0]]

As a list, it gives you all the other methods that you may need to work with. Keep in mind that the order of the resulting list is arbitrary.

Answered by: Aida144 | Posted: 01-03-2022



Answer 12

You can unpack the values to access the elements:

s = set([1, 2, 3])

v1, v2, v3 = s

print(v1, v2, v3)
# 1 2 3
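
Plain unpacking only works when the number of targets matches the size of the set exactly; otherwise Python raises ValueError. A short sketch, including the starred form that accepts any non-empty set (at the cost of copying the remaining elements into a list):

```python
s = {1, 2, 3}

try:
    v1, v2 = s                      # three elements, two targets
except ValueError as exc:
    print("unpacking failed:", exc)

first, *rest = s                    # works for any non-empty set
assert first in s and len(rest) == len(s) - 1
```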

Answered by: Ada320 | Posted: 01-03-2022



Answer 13

If you want just the first element, try this: b = (a - set()).pop(). The expression a - set() builds a copy of a, so the pop() does not modify the original set.

Answered by: Daisy885 | Posted: 01-03-2022



Answer 14

How about s.copy().pop()? I haven't timed it, but it should work and it's simple. It works best for small sets, however, as it copies the whole set.

Answered by: John975 | Posted: 01-03-2022



Answer 15

Another option is to use a dictionary with values you don't care about. E.g.,


poor_man_set = {}
poor_man_set[1] = None
poor_man_set[2] = None
poor_man_set[3] = None
...

You can treat the keys as a set, except that they're just a list:

keys = list(poor_man_set.keys())  # list() is needed on Python 3, where keys() returns a view
print("Some key = %s" % keys[0])

A side effect of this choice is that your code will be backwards compatible with older, pre-set versions of Python. It's maybe not the best answer but it's another option.

Edit: You can even do something like this to hide the fact that you used a dict instead of an array or set:


poor_man_set = {}
poor_man_set[1] = None
poor_man_set[2] = None
poor_man_set[3] = None
poor_man_set = list(poor_man_set.keys())  # list() keeps this indexable on Python 3
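
On Python 3.7 and later, where dicts are guaranteed to keep insertion order, the same idea also gives a predictable "first" element. A minimal sketch (the variable names are illustrative; next(iter(...)) is used instead of indexing, so no list copy is needed):

```python
# dict used as an insertion-ordered set (order guaranteed since Python 3.7)
ordered = dict.fromkeys([3, 1, 2])

first_key = next(iter(ordered))   # peeks at the oldest key, removes nothing
assert first_key == 3
assert list(ordered) == [3, 1, 2]

del ordered[first_key]            # remove it once it is safe to do so
assert next(iter(ordered)) == 1
```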

Answered by: Gianna802 | Posted: 01-03-2022


