Python
neon, 2011-09-02 00:48:20

Python - reading from stdin

I need to read and decode everything that arrives on stdin, byte by byte: regular characters, UTF-8, special keys. Single-byte characters are no problem, but I can't get the rest to behave sensibly.


I started with this:

#!/usr/bin/python
import select
import sys
import tty

poller = select.poll()
poller.register( sys.stdin, select.POLLIN )

tty.setcbreak( sys.stdin )

while True:
        events = poller.poll( 500 )
        if events:
                char = sys.stdin.read( 1 )
                print ord( char )


I run it and press abcd, and everything is fine:
97 98 99 100

I press Page Up, and instead of
27 91 53 126
I get only the first byte, 27. poll does not fire for the remaining 3 bytes until the next key is pressed.
If I press, for example, Page Up and then Enter, I get this:

27 <nothing> 91 53 126 10

That is, after the second key press poll fires 4 times: 3 times for the bytes left over from Page Up and once for Enter.
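
A likely explanation, not confirmed anywhere in this thread: sys.stdin is a buffered file object, so the first read( 1 ) pulls all four bytes of the escape sequence out of the kernel into its own buffer, while poll only looks at the file descriptor and sees nothing left. The sketch below reproduces that effect with a pipe standing in for the terminal. It uses Python 3 syntax, and the names kernel_has_data and demo are made up for illustration.

```python
import os
import select


def kernel_has_data(fd):
    """Return True if the kernel still holds unread bytes for fd."""
    readable, _, _ = select.select([fd], [], [], 0)
    return bool(readable)


def demo():
    page_up = b"\x1b[5~"  # bytes 27 91 53 126

    # Buffered read: one byte is returned, but the whole sequence is consumed.
    r, w = os.pipe()
    os.write(w, page_up)
    with os.fdopen(r, "rb") as buffered:
        first = buffered.read(1)
        left_after_buffered = kernel_has_data(r)
    os.close(w)

    # Unbuffered read: os.read takes exactly one byte, the rest stays visible.
    r, w = os.pipe()
    os.write(w, page_up)
    first_raw = os.read(r, 1)
    left_after_raw = kernel_has_data(r)
    os.close(r)
    os.close(w)

    return first, left_after_buffered, first_raw, left_after_raw
```

If this is indeed the cause, reading with os.read( sys.stdin.fileno(), n ) instead of sys.stdin.read( 1 ) should keep poll and the reads in step.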

So far I've come up with 2 solutions: one very bad, one dubious.

Very bad solution

Put stdin into non-blocking mode and read in a tight loop, happily devouring the CPU.
import sys
import tty
import fcntl
import os
fd = sys.stdin.fileno()
fl = fcntl.fcntl( fd, fcntl.F_GETFL )
fcntl.fcntl( fd, fcntl.F_SETFL, fl | os.O_NONBLOCK )
tty.setcbreak( sys.stdin )
while True:
        try:
                c = sys.stdin.read( 1 )
                print ord( c )
        except IOError:
                # no input available yet: retry immediately (busy loop)
                pass


Dubious solution

Immediately after receiving the first byte, start decoding UTF-8 using this table and, if necessary, read more bytes.
Doubts arise on two points:
  1. What if there is no next byte and sys.stdin.read blocks until the next key press? I haven't figured out why that would happen, but it seems quite possible. I really don't want to switch to non-blocking mode just for this.
  2. The first byte needs extra handling when it equals 27, in order to read all the special keys. I did not find a convenient table for them like the one for UTF-8.
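
The table lookup mentioned for UTF-8 is small enough to write out by hand. A minimal sketch in Python 3 syntax, assuming the first byte is passed as an integer; utf8_sequence_length is an illustrative name, not something from the post.

```python
def utf8_sequence_length(lead):
    """Return the total byte length of the UTF-8 sequence that starts with lead.

    lead is an integer in 0..255. Returns 0 for a byte that cannot start
    a sequence (a continuation byte or an invalid lead byte).
    """
    if lead < 0x80:
        return 1  # 0xxxxxxx: plain ASCII
    if 0xC2 <= lead <= 0xDF:
        return 2  # 110xxxxx
    if 0xE0 <= lead <= 0xEF:
        return 3  # 1110xxxx
    if 0xF0 <= lead <= 0xF4:
        return 4  # 11110xxx
    return 0
```

The function only says how many more bytes to expect; it does not settle the first doubt, because a read for those bytes can still block if they never arrive.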


All I need is to split the input byte stream into multibyte characters.
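
One way to do such a split, as a hedged sketch in Python 3 syntax: work on a bytes buffer that has already been read and cut it into tokens. The sketch assumes special keys arrive as CSI sequences (ESC [ followed by parameter and intermediate bytes and one final byte), which covers Page Up but not every terminal or every key. A lone ESC at the end of the buffer is emitted as a single byte, so the Escape key and the start of a sequence that has not fully arrived cannot be told apart here. split_keys is a hypothetical name.

```python
def split_keys(data):
    """Split a bytes buffer into CSI sequences, UTF-8 characters and single bytes.

    Returns (tokens, rest); rest is an incomplete tail to prepend to the
    next chunk of input.
    """
    tokens = []
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        if b == 0x1B and i + 1 < n and data[i + 1] == 0x5B:  # ESC [
            j = i + 2
            # Skip parameter (0x30-0x3F) and intermediate (0x20-0x2F) bytes.
            while j < n and 0x20 <= data[j] <= 0x3F:
                j += 1
            if j == n:
                break  # the final byte has not arrived yet
            end = j + 1  # include the final byte
        elif 0xC2 <= b <= 0xF4:
            # UTF-8 lead byte: the length follows from its value.
            length = 2 if b < 0xE0 else 3 if b < 0xF0 else 4
            end = i + length
            if end > n:
                break  # the character is not complete yet
        else:
            end = i + 1  # ASCII, control byte, or stray byte
        tokens.append(data[i:end])
        i = end
    return tokens, data[i:]
```

For example, feeding it the bytes for "a", Page Up and Enter in one buffer gives three tokens, with the four Page Up bytes kept together.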

3 answers
Gribozavr, 2011-09-02
@gribozavr

You need to switch the file descriptor to non-blocking mode:

#!/usr/bin/python
import select
import sys
import os
import tty
import fcntl

poller = select.poll()
poller.register(sys.stdin, select.POLLIN)

tty.setcbreak(sys.stdin)

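# add O_NONBLOCK to the existing status flags instead of overwriting them
flags = fcntl.fcntl(sys.stdin.fileno(), fcntl.F_GETFL)
fcntl.fcntl(sys.stdin.fileno(), fcntl.F_SETFL, flags | os.O_NONBLOCK)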

while True:
        events = poller.poll(500)
        if events:
                for char in sys.stdin.read():
                        print ord(char)
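
A minimal sketch of the same idea in Python 3 syntax, reading straight from the descriptor with os.read so that no buffered file object sits between poll and the data. set_nonblocking ORs O_NONBLOCK into the flags already set on the descriptor. Both helper names are illustrative and not part of the answer.

```python
import fcntl
import os


def set_nonblocking(fd):
    """Add O_NONBLOCK to fd's status flags, keeping the existing ones."""
    flags = fcntl.fcntl(fd, fcntl.F_GETFL)
    fcntl.fcntl(fd, fcntl.F_SETFL, flags | os.O_NONBLOCK)


def drain(fd, chunk=1024):
    """Return every byte currently available on a non-blocking fd."""
    data = b""
    while True:
        try:
            piece = os.read(fd, chunk)
        except BlockingIOError:
            break  # nothing more to read right now
        if not piece:
            break  # end of file
        data += piece
    return data
```

In the loop above, drain(sys.stdin.fileno()) would take the place of sys.stdin.read() whenever poll reports an event.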

enchantner, 2011-09-02
@enchantner

What if you try going through codecs? Something like this:

import codecs
import locale
import sys

locale.setlocale(locale.LC_ALL, '')

lang, encoding = locale.getdefaultlocale()
sys.stdin = codecs.getreader(encoding)(sys.stdin)

print 'From stdin:', repr(sys.stdin.read())

You can read about it here: www.doughellmann.com/PyMOTW/codecs/index.html
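
Note that sys.stdin.read() with no argument reads until end of input, so for key-by-key handling the codecs module's incremental decoder may be a better fit. A minimal sketch in Python 3 syntax, assuming the terminal sends UTF-8; make_utf8_splitter and feed are hypothetical names.

```python
import codecs


def make_utf8_splitter():
    """Return a feed(chunk) function that returns only complete characters."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")

    def feed(chunk):
        # An incomplete trailing sequence stays inside the decoder
        # and is completed by a later call.
        return list(decoder.decode(chunk))

    return feed
```

Each chunk read from stdin can be passed to feed as it arrives. This handles multibyte characters only; escape sequences such as Page Up come out as separate one-character strings.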

Kindman, 2011-09-05
@Kindman

You can switch to non-blocking mode and pause for 30 milliseconds inside the polling loop.
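
A rough sketch of that suggestion in Python 3 syntax, assuming the descriptor is already in non-blocking mode. read_with_pause and its rounds parameter are invented here; rounds only exists so the loop can be stopped in a test.

```python
import os
import time


def read_with_pause(fd, handle, pause=0.03, rounds=None):
    """Poll a non-blocking fd, sleeping for pause seconds when it is empty.

    handle is called with each chunk of bytes that was read. rounds limits
    the number of attempts; None means loop until end of file.
    """
    done = 0
    while rounds is None or done < rounds:
        done += 1
        try:
            chunk = os.read(fd, 1024)
        except BlockingIOError:
            chunk = None  # no input yet
        if chunk == b"":
            break  # end of file
        if chunk:
            handle(chunk)
        else:
            time.sleep(pause)  # avoid spinning at full CPU
```

The pause trades a little latency (up to 30 ms per key press with the default) for a mostly idle CPU.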
