Program Tip

Remove empty strings from a list of strings

programtip 2020. 10. 2. 23:06


I want to remove all empty strings from a list of strings in Python.

My idea looks like this:

while '' in str_list:
    str_list.remove('')

Is there a more Pythonic way to do this?


I would use filter:

str_list = filter(None, str_list) # fastest
str_list = filter(bool, str_list) # fastest
str_list = filter(len, str_list)  # a bit slower
str_list = filter(lambda item: item, str_list) # slower than list comprehension

Python 3 returns an iterator from filter, so the call should be wrapped in list():

str_list = list(filter(None, str_list)) # fastest
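
As a minimal sketch of the Python 3 behaviour described above (the sample list is made up for illustration): filter returns a lazy iterator that can be consumed only once, which is why wrapping it in list() matters if the result is used more than once.

```python
# Hypothetical sample data, not from the original post.
str_list = ["a", "", "b", ""]

lazy = filter(None, str_list)   # a filter object in Python 3, not a list
first_pass = list(lazy)         # consumes the iterator
second_pass = list(lazy)        # the iterator is already exhausted

print(first_pass)   # ['a', 'b']
print(second_pass)  # []
```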


Tests:

>>> from timeit import timeit
>>> timeit('filter(None, str_list)', 'str_list=["a"]*1000', number=100000)
2.4797441959381104
>>> timeit('filter(bool, str_list)', 'str_list=["a"]*1000', number=100000)
2.4788150787353516
>>> timeit('filter(len, str_list)', 'str_list=["a"]*1000', number=100000)
5.2126238346099854
>>> timeit('[x for x in str_list if x]', 'str_list=["a"]*1000', number=100000)
13.354584932327271
>>> timeit('filter(lambda item: item, str_list)', 'str_list=["a"]*1000', number=100000)
17.427681922912598
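
The timings above appear to come from Python 2, where filter builds a list directly. To repeat the comparison under Python 3, a rough sketch along these lines could be used; the list() wrappers, the dictionary of candidates, and the reduced repeat count are my assumptions, and the absolute numbers will differ from machine to machine.

```python
from timeit import timeit

# Same setup string as in the tests above.
setup = 'str_list = ["a"] * 1000'

# list() forces the lazy Python 3 filter object to do the actual work.
candidates = {
    "filter(None)": "list(filter(None, str_list))",
    "filter(bool)": "list(filter(bool, str_list))",
    "filter(len)": "list(filter(len, str_list))",
    "list comprehension": "[x for x in str_list if x]",
    "filter(lambda)": "list(filter(lambda item: item, str_list))",
}

# A smaller repeat count than the original 100000 keeps the run short.
results = {name: timeit(stmt, setup, number=2000) for name, stmt in candidates.items()}

for name, seconds in sorted(results.items(), key=lambda kv: kv[1]):
    print(f"{name:<20} {seconds:.4f} s")
```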

Use a list comprehension:

strings = ["first", "", "second"]
[x for x in strings if x]

Output: ['first', 'second']
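
If other parts of a program hold a reference to the same list, the comprehension can be combined with slice assignment so that the existing list object is updated rather than replaced. This is a small sketch of that idea; the alias variable is only there to demonstrate the effect.

```python
strings = ["first", "", "second"]
alias = strings  # a second reference to the same list object

# Slice assignment replaces the contents in place instead of rebinding the name.
strings[:] = [x for x in strings if x]

print(strings)           # ['first', 'second']
print(alias is strings)  # True
```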


filter actually has a special option for this:

filter(None, sequence)

It filters out all elements that evaluate to False. There is no need to pass an actual callable such as bool, len and so on.

It is as fast as map(bool, ...).
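
Because filter(None, ...) drops every falsy element, it removes more than empty strings when the list is not purely strings. A short sketch with made-up mixed data shows what survives:

```python
# Hypothetical mixed list: only the truthy elements are kept.
mixed = ["hello", "", None, 0, [], "0", " "]

kept = list(filter(None, mixed))
print(kept)  # ['hello', '0', ' ']
```

Note that '0' and ' ' are non-empty strings and therefore truthy, so they are kept.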


>>> lstr = ['hello', '', ' ', 'world', ' ']
>>> lstr
['hello', '', ' ', 'world', ' ']

>>> ' '.join(lstr).split()
['hello', 'world']

>>> list(filter(None, lstr))
['hello', ' ', 'world', ' ']

Time comparison

>>> from timeit import timeit
>>> timeit('" ".join(lstr).split()', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
4.226747989654541
>>> timeit('filter(None, lstr)', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
3.0278358459472656

Note that filter(None, lstr) does not remove whitespace-only strings such as ' '; it only prunes away '', whereas ' '.join(lstr).split() removes both.

Using filter() with whitespace-only strings removed as well takes considerably longer:

>>> timeit('filter(None, [l.replace(" ", "") for l in lstr])', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
18.101892948150635

The reply from @Ib33X is great. If you also want to drop strings that become empty after stripping, you need to call strip() as well. Otherwise a whitespace-only string such as " " counts as non-empty and is kept. This can be achieved as follows:

strings = ["first", "", "second ", " "]
[x.strip() for x in strings if x.strip()]

The result is ['first', 'second'].
If you want to use filter instead, you can write
list(filter(lambda item: item.strip(), strings)). This drops the same elements but returns the surviving strings unstripped, so 'second ' keeps its trailing space.
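
A quick check of the two forms, as a sketch using the same sample list: the comprehension returns stripped strings, while filter only decides which elements to keep and returns them unchanged.

```python
strings = ["first", "", "second ", " "]

# Strips each kept element as well as testing it.
stripped = [x.strip() for x in strings if x.strip()]

# Uses strip() only as the test; kept elements are returned as they were.
kept_as_is = list(filter(lambda item: item.strip(), strings))

print(stripped)    # ['first', 'second']
print(kept_as_is)  # ['first', 'second ']
```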


Instead of if x, I would use if x != '' in order to eliminate only empty strings. Like this:

str_list = [x for x in str_list if x != '']

This preserves None values in your list. Likewise, if your list contains integers and 0 is among them, it is preserved too.

For example,

str_list = [None, '', 0, "Hi", '', "Hello"]
[x for x in str_list if x != '']
[None, 0, "Hi", "Hello"]

Depending on the size of your list, it may be most efficient if you use list.remove() rather than create a new list:

l = ["1", "", "3", ""]

while True:
  try:
    l.remove("")
  except ValueError:
    break

This has the advantage of not creating a new list, but the disadvantage of having to search from the beginning each time, although unlike using while '' in l as proposed above, it only requires searching once per occurrence of '' (there is certainly a way to keep the best of both methods, but it is more complicated).
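
One possible way to keep the best of both methods, sketched under my own assumptions (the function name remove_empty_in_place is hypothetical and not from the original answer): make a single pass, copy each kept element down to the next free slot, then truncate the tail. This modifies the list in place and avoids restarting the search for every occurrence.

```python
def remove_empty_in_place(items):
    """Remove '' from items in one pass without building a new list."""
    write = 0  # index of the next slot to fill with a kept element
    for item in items:
        if item != "":
            items[write] = item  # write never runs ahead of the read position
            write += 1
    del items[write:]  # drop the leftover tail
    return items

l = ["1", "", "3", ""]
remove_empty_in_place(l)
print(l)  # ['1', '3']
```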


Use filter:

newlist=filter(lambda x: len(x)>0, oldlist) 

The drawback of using filter, as pointed out, is that it is slower than the alternatives; also, lambda is usually costly.

Or you can go for the simplest and the most iterative of all:

# I am assuming listtext is the original list containing (possibly) empty items
newlist = []  # start from an empty result list
for item in listtext:
    if item:
        newlist.append(str(item))
# You can remove str() based on the content of your original list

This is the most intuitive of the methods and runs in decent time.


Keep in mind that if you want to keep the whitespace within a string, some approaches may remove it unintentionally. If you have this list

['hello world', ' ', '', 'hello']

what you probably want is ['hello world', 'hello'].

First strip each element so that whitespace-only strings become empty strings:

space_to_empty = [x.strip() for x in _text_list]

then remove the empty strings from the list:

space_clean_list = [x for x in space_to_empty if x]
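
The two steps can also be folded into one comprehension that calls strip() only once per element. This is just a sketch: it relies on the assignment expression (:=) available from Python 3.8, and the variable names are mine.

```python
text_list = ["hello world", " ", "", "hello"]

# s holds the stripped value; elements whose stripped value is empty are skipped.
clean = [s for x in text_list if (s := x.strip())]

print(clean)  # ['hello world', 'hello']
```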

As reported by Aziz Alto, filter(None, lstr) does not remove whitespace-only strings such as ' ', but if you are sure lstr contains only strings you can use filter(str.strip, lstr):

>>> lstr = ['hello', '', ' ', 'world', ' ']
>>> lstr
['hello', '', ' ', 'world', ' ']
>>> ' '.join(lstr).split()
['hello', 'world']
>>> list(filter(str.strip, lstr))
['hello', 'world']

Time comparison on my PC:

>>> from timeit import timeit
>>> timeit('" ".join(lstr).split()', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
3.356455087661743
>>> timeit('filter(str.strip, lstr)', "lstr=['hello', '', ' ', 'world', ' ']", number=10000000)
5.276503801345825

The fastest solution to remove '' and empty strings with a space ' ' remains ' '.join(lstr).split().

As reported in a comment the situation is different if your strings contain spaces.

>>> lstr = ['hello', '', ' ', 'world', '    ', 'see you']
>>> lstr
['hello', '', ' ', 'world', '    ', 'see you']
>>> ' '.join(lstr).split()
['hello', 'world', 'see', 'you']
>>> list(filter(str.strip, lstr))
['hello', 'world', 'see you']

You can see that filter(str.strip, lstr) preserves strings that contain spaces, whereas ' '.join(lstr).split() splits these strings.
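
The condition that the list contains only strings matters: str.strip raises TypeError when it is handed a non-string such as None. The sketch below shows the failure and one possible guard; the helper name keep_non_blank and the sample list are assumptions for illustration.

```python
lstr = ["hello", "", " ", None, "see you"]

# str.strip is an unbound str method, so a None element makes it raise TypeError.
try:
    list(filter(str.strip, lstr))
    failed = False
except TypeError:
    failed = True
print(failed)  # True

def keep_non_blank(items):
    """Hypothetical helper: keep only str items that contain non-whitespace text."""
    return [s for s in items if isinstance(s, str) and s.strip()]

print(keep_non_blank(lstr))  # ['hello', 'see you']
```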


To eliminate empties after stripping:

slist = map(lambda s: s and s.strip(), slist)
slist = filter(None, slist)

Some pros:

  • lazy, based on generators, to save memory;
  • decent understandability of the code;
  • fast, selectively using builtins and comprehensions.

    def f1(slist):
        slist = [s and s.strip() for s in slist]
        return list(filter(None, slist))
    
    def f2(slist):
        slist = [s and s.strip() for s in slist]
        return [s for s in slist if s]
    
    
    def f3(slist):
        slist = map(lambda s: s and s.strip(), slist)
        return list(filter(None, slist))
    
    def f4(slist):
        slist = map(lambda s: s and s.strip(), slist)
        return [s for s in slist if s]
    
    %timeit f1(words)
    10000 loops, best of 3: 106 µs per loop
    
    %timeit f2(words)
    10000 loops, best of 3: 126 µs per loop
    
    %timeit f3(words)
    10000 loops, best of 3: 165 µs per loop
    
    %timeit f4(words)
    10000 loops, best of 3: 169 µs per loop
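
The laziness mentioned in the first bullet applies to Python 3, where map and filter return iterators. As a rough sketch, the same strip-then-drop pipeline can be written as a generator; the function name iter_non_blank and the sample list are made up for illustration, since the original words list is not shown.

```python
def iter_non_blank(strings):
    """Yield stripped strings lazily, skipping None, '' and whitespace-only items."""
    for s in strings:
        stripped = s and s.strip()  # same guard as above: None and '' stay falsy
        if stripped:
            yield stripped

sample = [" alpha ", "", None, "  ", "beta"]
print(list(iter_non_blank(sample)))  # ['alpha', 'beta']
```

Nothing is computed until the generator is iterated, so it can be fed straight into a loop without building an intermediate list.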
    

For a list with a combination of whitespace-only and empty values, use a simple list comprehension -

>>> s = ['I', 'am', 'a', '', 'great', ' ', '', '  ', 'person', '!!', 'Do', 'you', 'think', 'its', 'a', '', 'a', '', 'joke', '', ' ', '', '?', '', '', '', '?']

As you can see, this list has a combination of whitespace-only and empty elements. Using the snippet -

>>> d = [x for x in s if x.strip()]
>>> d
['I', 'am', 'a', 'great', 'person', '!!', 'Do', 'you', 'think', 'its', 'a', 'a', 'joke', '?', '?']

str_list = ['2', '', '2', '', '2', '', '2', '', '2', '']

for item in str_list[:]:  # iterate over a copy so removals don't skip items
    if len(item) < 1:
        str_list.remove(item)

Short and sweet.
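
A caution on loops of this shape: removing from a list while a for loop iterates over that same list can skip elements, because the remaining items shift left under the loop's position. The sketch below contrasts iterating over the list itself with iterating over a copy; both function names are hypothetical.

```python
def remove_while_iterating(items):
    """Removes from the list being iterated; may skip adjacent empty strings."""
    for item in items:
        if len(item) < 1:
            items.remove(item)
    return items

def remove_over_copy(items):
    """Iterates over a shallow copy, so every original element is visited."""
    for item in items[:]:
        if len(item) < 1:
            items.remove(item)
    return items

print(remove_while_iterating(["", "", "a"]))  # ['', 'a']
print(remove_over_copy(["", "", "a"]))        # ['a']
```

With alternating values, as in the sample list above, the skipping happens to go unnoticed; two adjacent empty strings are enough to expose it.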


Loop through the existing string list and check each element for an empty string; if it is not empty, add it to a new string list, and then replace the old string list with the new one.


filter(None, lstr) does not remove whitespace-only strings such as ' '; it only prunes away ''.

' '.join(lstr).split() removes both, but if any element of your list contains a space it will also change that element, because it first joins all the elements of the list and then splits them on whitespace. So you should use:

lstr = ['hello', '', ' ', 'world', ' ']
print(list(filter(lambda x: x != '', filter(lambda x: x != ' ', lstr))))

It removes both and does not affect your other elements. For example:

lstr = ['hello', '', ' ', 'world ram', ' ']
print(' '.join(lstr).split())
print(list(filter(lambda x: x != '', filter(lambda x: x != ' ', lstr))))

Output:

['hello', 'world', 'ram']   <- output of ' '.join(lstr).split()
['hello', 'world ram']

Reference URL: https://stackoverflow.com/questions/3845423/remove-empty-strings-from-a-list-of-strings
