How do I read one character at a time from a file in Python?


programtip 2020. 11. 13. 23:56

Can anyone tell me how I can do this?

with open(filename) as f:
    while True:
        c = f.read(1)  # returns '' once the end of the file is reached
        if not c:
            print("End of file")
            break
        print("Read a character:", c)

First, open the file:

with open("filename") as fileobj:
    for line in fileobj:
        for ch in line:
            print(ch)

I like the accepted answer: it is simple and gets the job done. I would also like to offer an alternative implementation:

def chunks(filename, buffer_size=4096):
    """Reads `filename` in chunks of `buffer_size` bytes and yields each chunk
    until no more characters can be read; the last chunk will most likely have
    less than `buffer_size` bytes.

    :param str filename: Path to the file
    :param int buffer_size: Buffer size, in bytes (default is 4096)
    :return: Yields chunks of `buffer_size` size until exhausting the file
    :rtype: str

    """
    with open(filename, "rb") as fp:
        chunk = fp.read(buffer_size)
        while chunk:
            yield chunk
            chunk = fp.read(buffer_size)

def chars(filename, buffersize=4096):
    """Yields the contents of file `filename` one byte at a time. Warning:
    a byte is only a character for encodings where one character is encoded
    as one byte.

    :param str filename: Path to the file
    :param int buffersize: Buffer size for the underlying chunks,
    in bytes (default is 4096)
    :return: Yields the contents of `filename` one byte at a time.
    :rtype: bytes

    """
    for chunk in chunks(filename, buffersize):
        # Slice instead of indexing: on Python 3, indexing bytes gives an int.
        for i in range(len(chunk)):
            yield chunk[i:i + 1]

def main(buffersize, filenames):
    """Reads several files character by character and redirects their contents
    to `/dev/null`.

    """
    for filename in filenames:
        with open("/dev/null", "wb") as fp:
            for char in chars(filename, buffersize):
                fp.write(char)

if __name__ == "__main__":
    # Try reading several files varying the buffer size
    import sys
    buffersize = int(sys.argv[1])
    filenames  = sys.argv[2:]
    sys.exit(main(buffersize, filenames))

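The code I suggest is essentially the same idea as the accepted answer: read a given number of bytes from the file. The difference is that it first reads a good-sized chunk of data (4096 is a good default for x86, but you may want to try 1024 or 8192, or any multiple of your page size), and then yields the characters in that chunk one by one.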
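The warning in the `chars` docstring matters for encodings such as UTF-8, where a character can span several bytes and may be split across two chunks. One possible way to handle that, as a rough sketch rather than part of the original answer, is to feed each chunk through an incremental decoder from the standard `codecs` module. The function name `decoded_chars` and the temporary demo file are made up for illustration.

```python
import codecs
import os
import tempfile


def decoded_chars(filename, encoding="utf-8", buffer_size=4096):
    """Yield decoded characters, reading `buffer_size` bytes at a time."""
    # The incremental decoder holds back an incomplete trailing byte
    # sequence and completes it when the next chunk arrives.
    decoder = codecs.getincrementaldecoder(encoding)()
    with open(filename, "rb") as fp:
        while True:
            chunk = fp.read(buffer_size)
            if not chunk:
                break
            for ch in decoder.decode(chunk):
                yield ch
        # Final flush: raises UnicodeDecodeError if the file ends mid-character.
        for ch in decoder.decode(b"", final=True):
            yield ch


# Hypothetical demo file: "é" and "한" take 2 and 3 bytes in UTF-8.
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write("é한a".encode("utf-8"))
try:
    # A 1-byte buffer forces every multi-byte character to span chunks.
    result = list(decoded_chars(tmp.name, buffer_size=1))
finally:
    os.remove(tmp.name)
print(result)
```

Even with a 1-byte buffer, the characters come out whole, because the decoder only emits a character once all of its bytes have been seen.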

The code I present may be faster for larger files. Take, for example, the entire text of War and Peace by Tolstoy. These are my timing results (MacBook Pro running OS X 10.7.4; so.py is the name I gave to the code I pasted):

$ time python so.py 1 2600.txt.utf-8
python so.py 1 2600.txt.utf-8  3.79s user 0.01s system 99% cpu 3.808 total
$ time python so.py 4096 2600.txt.utf-8
python so.py 4096 2600.txt.utf-8  1.31s user 0.01s system 99% cpu 1.318 total

Now: do not take the buffer size at 4096 as a universal truth; look at the results I get for different sizes (buffer size (bytes) vs wall time (sec)):

As you can see, you can start seeing gains earlier on (and my timings are likely very inaccurate); the buffer size is a trade-off between performance and memory. The default of 4096 is just a reasonable choice but, as always, measure first.
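To follow the "measure first" advice on your own machine, something along these lines could work. It is only a sketch: the helper names `read_all` and `time_buffer_sizes`, the list of sizes, and the 256 KiB temporary file are all assumptions for illustration, and it times the chunked reads alone, not the per-character loop.

```python
import os
import tempfile
import time


def read_all(filename, buffer_size):
    """Read `filename` in chunks of `buffer_size` bytes; return the byte count."""
    total = 0
    with open(filename, "rb") as fp:
        while True:
            chunk = fp.read(buffer_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def time_buffer_sizes(filename, sizes):
    """Return {buffer_size: seconds} for one pass over `filename` per size."""
    timings = {}
    for size in sizes:
        start = time.perf_counter()
        read_all(filename, size)
        timings[size] = time.perf_counter() - start
    return timings


# Hypothetical 256 KiB sample file; substitute your own data.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"x" * (256 * 1024))
try:
    timings = time_buffer_sizes(tmp.name, [1, 64, 1024, 4096, 8192])
finally:
    os.remove(tmp.name)
for size, seconds in timings.items():
    print(size, round(seconds, 4))
```

A single pass per size is noisy, so repeating each measurement a few times and taking the minimum would give steadier numbers.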

Python itself can help you with this, in interactive mode:

>>> help(file.read)
Help on method_descriptor:

read(...)
    read([size]) -> read at most size bytes, returned as a string.

    If the size argument is negative or omitted, read until EOF is reached.
    Notice that when in non-blocking mode, less data than what was requested
    may be returned, even if no size parameter was given.
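The help text above comes from Python 2, where `file` is a builtin. On Python 3 there is no `file` builtin, and as far as I know `read(1)` on a file opened in text mode returns one character (one code point), while in binary mode it returns one byte. A small sketch, using a made-up temporary file:

```python
import os
import tempfile

# Hypothetical demo file: "한" takes three bytes in UTF-8.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write("한a".encode("utf-8"))
try:
    with open(tmp.name, encoding="utf-8") as f:
        text_first = f.read(1)   # one decoded character
    with open(tmp.name, "rb") as f:
        bytes_first = f.read(1)  # one raw byte
finally:
    os.remove(tmp.name)
print(repr(text_first), repr(bytes_first))
```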

Just:

myfile = open(filename)
onecharacter = myfile.read(1)

I learned a new idiom for this today while watching Raymond Hettinger's Transforming Code into Beautiful, Idiomatic Python:

import functools

with open(filename) as f:
    f_read_ch = functools.partial(f.read, 1)
    for ch in iter(f_read_ch, ''):
        print('Read a character:', repr(ch))
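The two-argument form of `iter` keeps calling the function until it returns the sentinel, here the empty string that `read` gives at end of file. If you are on Python 3.8 or later, an assignment expression is another way to write the same loop. This is just a sketch, with `io.StringIO` standing in for an open text file:

```python
import functools
import io

f = io.StringIO("abc")  # stand-in for an open text file
# iter(callable, sentinel): stop as soon as f.read(1) returns ''.
with_iter = list(iter(functools.partial(f.read, 1), ""))

f = io.StringIO("abc")
with_walrus = []
# Same loop with an assignment expression (Python 3.8+).
while (ch := f.read(1)):
    with_walrus.append(ch)

print(with_iter, with_walrus)
```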

Just read a single character

f.read(1)

You should try f.read(1), which is definitely correct and the right thing to do.

This will also work:

with open("filename") as fileObj:
    for line in fileObj:  
        for ch in line:
            print(ch)

It goes through every line in the file and every character in every line.
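If you would rather have a single flat stream of characters than two nested loops, `itertools.chain.from_iterable` can do the flattening. A minimal sketch, with `io.StringIO` standing in for the open file:

```python
import io
import itertools

fileobj = io.StringIO("ab\ncd\n")  # stand-in for an open text file
# Each line is itself an iterable of characters; chain them together.
chars = list(itertools.chain.from_iterable(fileobj))
print(chars)
```

Note that the newline characters are part of the stream, and that this still holds one whole line in memory at a time.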

# read the whole file at once, then print its characters one by one
with open('file.txt') as f:
    for i in f.read():
        print(i)

f = open('hi.txt', 'w')
f.write('0123456789abcdef')
f.close()
f = open('hi.txt', 'r')  # reopen the same file that was just written
f.seek(12)
print(f.read(1))  # This will read just "c"
f.close()

To add to the above: if you are reading a file that contains a very long line, long enough to exhaust your memory, you might consider reading the file into a buffer and then yielding each character:

def read_char(inputfile, buffersize=10240):
    with open(inputfile, 'r') as f:
        while True:
            buf = f.read(buffersize)
            if not buf:
                break
            for char in buf:
                yield char
        yield ''  # final '' marks end of file, so an empty file still yields once

if __name__ == "__main__":
    for char in read_char('./very_large_file.txt'):
        process(char)  # process() is a placeholder for your own handler

Reference URL: https://stackoverflow.com/questions/2988211/how-to-read-a-single-character-at-a-time-from-a-file-in-python
