Reported on the mailinglist:
"
I discovered recently that if an application running inside st tries to
send a DCS string, subsequent Unicode characters get messed up. For
example, consider the following test-case:
printf '\303\277\033P\033\\\303\277'
...where:
- \303\277 is the UTF-8 encoding of U+00FF LATIN SMALL LETTER Y WITH
DIAERESIS (ÿ).
- \033P is ESC P, the token that begins a DCS string.
- \033\\ is ESC \, a token that ends a DCS string.
- \303\277 is the same ÿ character again.
If I run the above command in a VTE-based terminal, or xterm, or
QTerminal, or pterm (PuTTY), I get the output:
ÿÿ
...which is to say, the empty DCS string is ignored. However, if I run
that command inside st (as of commit 9ba7ecf), I get:
ÿÿ
...where those last two characters are \303\277 interpreted as ISO8859-1
characters, instead of UTF-8.
I spent some time tracing through the state machines in st.c, and so far
as I can tell, this is how it works currently:
- ESC P sets the "ESC_DCS" and "ESC_STR" flags, indicating that
incoming bytes should be collected into the strescseq buffer, rather
than being interpreted.
- ESC \ sets the "ESC_STR_END" flag (when ESC is received), and then
calls strhandle() (when \ is received) to interpret the collected
bytes.
- If the collected bytes begin with 'P' (i.e. if this was a DCS
string) strhandle() sets the "ESC_DCS" flag again, confusing the
state machine.
If my understanding is correct, fixing the problem should be as easy as
removing the line that sets ESC_DCS from strhandle():
diff --git a/st.c b/st.c
index ef8abd5..b5b805a 100644
--- a/st.c
+++ b/st.c
@@ -1897,7 +1897,6 @@ strhandle(void)
xsettitle(strescseq.args[0]);
return;
case 'P': /* DCS -- Device Control String */
- term.mode |= ESC_DCS;
case '_': /* APC -- Application Program Command */
case '^': /* PM -- Privacy Message */
return;
I've tried the above patch and it fixes my problem, but I don't know if
it introduces any others.
"
Similar to the xterm AllowWindowOps option, this is an option to allow or
disallow certain (non-interactive) operations that can be insecure or
exploited.
NOTE: xsettitle() is not guarded by this because st does not support printing
the window title. Else this could be exploitable (arbitrary code execution).
Similar problems have been found in the past in other terminal emulators.
The sequence for base64-encoded clipboard copy is now guarded because it allows
a sequence written to the terminal to manipulate the clipboard of the running
user non-interactively, for example:
printf '\x1b]52;0;ZWNobyBoaQ0=\a'
Add the functionality back in for xterm compatibility, but do not expose the
capability in st.info (yet).
Some notes:
It was reverted because it caused some issues with ncurses in some
configurations, namely when using BSD padding (--enable-bsdpad, BSD_TPUTS) in
ncurses it caused issues with repeating digits.
A fix has been upstreamed in ncurses since snapshot 20200523. The fix is also
backported to OpenBSD -current.
This reverts commit e8392b282c2eaa28725241a9612804fb55113da4.
There is currently a bug in older ncurses versions (like on OpenBSD) where a
fix for a bug with REP is not backported yet. Most likely in tty/tty_update.c:
Noticed while using lynx (which uses ncurses/curses).
To reproduce using lynx: echo "Z0000000" | lynx -stdin
or using the program:
int
main(void)
{
WINDOW *win;
win = initscr();
printw("Z0000000");
refresh();
sleep(5);
return 0;
}
This prints "ZZZZZZZ" (incorrectly).
The sequence \e[Nb prints the last printed char N (more) times if it's
printable, and it's ignored after newline or other control chars.
This is Ecma-048/ANSI-X3.6 sequence and not DEC VT. It's supported by
xterm, and ncurses uses it when possible, e.g. when TERM is xterm* (and
with this commit also st*).
xterm supports only codepoints<=255, possibly due to internal limits.
We support any value/codepoint which was placed in a cell.
To test:
- tput rep 65 4 -> prints 'AAAA'
- printf "\342\225\246\033[4b" -> prints U+2566 1+4 times.
St uses a very good hack where mouse wheel genereates ^Y and ^E,
that are the same keys that less and vi uses for backward and
fordward scrolling. Scroll, as many terminal emulators, use
shift+Prev/Next for scrolling, but it is also using ^E and ^Y
for scroling, characters that are reserved in the POSIX shell
in emacs mode for end of line and yanking, making scroll unsable
in st.
This patch adds a new hack, making shift+wheel returning the
same sequences than shift+Prev/Next, meaning that scroll or
any other similar program will not be able to differentiate
between them.
Fix an issue with incorrect (partial) written sequences when libc wcwidth() ==
-1. The sequence is updated to on wcwidth(u) == -1:
c = "\357\277\275"
but len isn't.
A way to reproduce in practise:
* st -o dump.txt
* In the terminal: printf '\xcd\xb8'
- This is codepoint 888, on OpenBSD it reports wcwidth() == -1.
- Quit the terminal.
- Look in dump.txt (partial written sequence of "UTF_INVALID").
This was introduced in:
" commit 11625c7166b7e4dad414606227acec2de1c36464
Author: czarkoff@gmail.com <czarkoff@gmail.com>
Date: Tue Oct 28 12:55:28 2014 +0100
Replace character with U+FFFD if wcwidth() is -1
Helpful when new Unicode codepoints are not recognized by libc."
Change:
Remove setting the sequence. If this happens to break something, another
solution could be setting len = 3 for the sequence.
st could easily tear/flicker with animation or other unattended
output. This commit eliminates most of the tear/flicker.
Before this commit, the display timing had two "modes":
- Interactively, st was waiting fixed `1000/xfps` ms after forwarding
the kb/mouse event to the application and before drawing.
- Unattended, and specifically with animations, the draw frequency was
throttled to `actionfps`. Animation at a higher rate would throttle
and likely tear, and at lower rates it was tearing big frames
(specifically, when one `read` didn't get a full "frame").
The interactive behavior was decent, but it was impossible to get good
unattended-draw behavior even with carefully chosen configuration.
This commit changes the behavior such that it draws on idle instead of
using fixed latency/frequency. This means that it tries to draw only
when it's very likely that the application has completed its output
(or after some duration without idle), so it mostly succeeds to avoid
tear, flicker, and partial drawing.
The config values minlatency/maxlatency replace xfps/actionfps and
define the range which the algorithm is allowed to wait from the
initial draw-trigger until the actual draw. The range enables the
flexibility to choose when to draw - when least likely to flicker.
It also unifies the interactive and unattended behavior and config
values, which makes the code simpler as well - without sacrificing
latency during interactive use, because typically interactively idle
arrives very quickly, so the wait is typically minlatency.
While it only slighly improves interactive behavior, for animations
and other unattended-drawing it improves greatly, as it effectively
adapts to any [animation] output rate without tearing, throttling,
redundant drawing, or unnecessary delays (sounds impossible, but it
works).
St used to use backspace as BS until the commit 230d0c8, but due
to general lack of knowledge of lusers, we moved to the most common
configuration in linux to avoid answering the same question 3 times
per month. With the most common configuration we have a backspace
that returns a DEL, and we have a Delete key that doesn't return a
DEL character neither a BS.
When dealing with devices connected using a serial line (or even
with Plan9) it is more common Backspace as BS and Delete as DEL. For
this reason, st is not always the best tool when you talk with a
serial device.
This patch adds new terminfo entries for Backspace as BS and Delete
as DEL. A patch for confg.h is also added, to make easier switch
between both configurations.
When a read operation returns 0 then it means that we arrived to the end of the
file, and new reads will return 0 unless you do some other operation such as
lseek(). This case happens with USB-232 adapters when they are unplugged.
Scroll is a program that stores all the lines of its child and be used in st as
a way of implementing scrollback.
This solution is much better than implementing the scrollback in st itself
because having a different program allows to use it in any other program
without doing modifications to those programs.
This line didn't work at mshortcuts at config.h:
/* mask button function arg release */
{ ShiftMask, Button2, selpaste, {.i = 0}, 1 },
and now it does work.
The issue was that XButtonEvent.state is "the logical state ... just prior
to the event", which means that on release the state has the Button2Mask
bit set because button2 was down just before it was released.
The issue didn't manifest with the default shift + middle-click on release
(to override mouse mode) because its specified modifier is XK_ANY_MOD, at
which case match(...) ignores any specific bits and simply returns true.
The issue also doesn't manifest on press, because prior to the event
Button<N> was not down and its mask bit is not set.
Fix by filtering out the mask of the button which we're currently matching.
We could have said "well, that's how button events behave, you should
use ShiftMask|Button2Mask for release", but this both not obvious to
figure out, and specifically here always filtering does not prevent
configuring any useful modifiers combination. So it's a win-win.
XCreateIC ICValues default XNFocusWindow to XNClientWindow if not
specified, it can be omitted since it is the same.
From the documentation
https://www.x.org/releases/current/doc/libX11/libX11/libX11.html
> Focus Window
>
> The XNFocusWindow argument specifies the focus window. The primary
> purpose of the XNFocusWindow is to identify the window that will receive
> the key event when input is composed.
>
> When this XIC value is left unspecified, the input method will use the
> client window as the default focus window.
Do not try to set specific IM method, let the user specify it with
XMODIFIERS.
If the requested method is not available or opening fails, fallback to
the default input handler and register a handler on the new IM server
availability signal.
Do the same when the input server is closed and (re)started.
Current buffer is too short to input medium to long sentences from IME.
Input with longer text will show the wrong input, taking 64 instead of
32 bytes should be enough for most of the cases. Broken cases before,
Chinese (taken from song 也可以)
可不可以轻轻的松开自己
Japanese (taken from bootleggers rom quote)
あなたは家のように感じる
Strings which an application sends to the terminal in OSC, DCS, etc
are typically small (title, colors, etc) but one exception is OSC 52
which copies text to the clipboard, and is used for instance by tmux.
Previously st cropped these strings at 512 bytes, which for OSC 52
limited the copied text to 382 bytes (remaining buffer space before
base64). This made it less useful than it can be.
Now it's a dynamic growing buffer. It remains allocated after use,
resets to 512 when a new string starts, or leaked on exit.
Resetting/deallocating the buffer right after use (at strhandle) is
possible with some more code, however, it doesn't always end up used,
and to cover those cases too will require even more code, so resetting
only on new string is good enough for now.
STRescape holds strings in escape sequences such as OSC and DCS, and
its buffer is 512 bytes.
If the input is too big then trailing chars are ignored, but the test
was off-by-1 such that it took 510 chars instead of 511 (before a
terminating NULL is added).
Now the full size can be utilized.
Previously, base64dec checked terminating input '\0' every 4 calls to
base64dec_getc, where the latter progressed one or more chars on each
call, and could read past '\0' in the way it was used.
The input to base64dec currently comes only from OSC 52 escape seq
(copy to clipboard), and reading past '\0' or even past the buffer
boundary was easy to trigger.
Also, even if we could trust external input to be valid base64, there
are different base64 standards, and not all of them require padding
to 4 bytes blocks (using trailing '=' chars).
It didn't affect short OSC 52 strings because the buffer is initialized
to 0's, so typically it did stop within the buffer, but if the string
was trimmed to fit (the buffer is 512 bytes) then it did also read past
the end of the buffer, and the decoded suffix ended up arbitrary.
This patch makes base64dec_getc not progress past '\0', and instead
produce fake trailing padding of '='.
Additionally, at base64dec, if padding is detected at the first or
second byte of a quartet, then we identify it as invalid and abort
(a valid quartet has at least two leading non-padding bytes).
For WM_CLASS this is mentioned in the ICCCM docs
https://tronche.com/gui/x/icccm/sec-4.html#s-4.1.2.5
(third sentence).
When changing the WM_CLASS from the command line, this is necessary for
window managers to pick it up before applying class-based rules.
The recent mouse shurtcuts commits allow customization, but ignore
forcemousemod mask (default: shift) as a modifier, for no good reason
other than following the behavior of the KB shortcuts.
Allow using forcemousemod too, which now can be used to invoke
different shortcuts, though the automatic effect of forcemousemod will
be lost for buttons which use mask with forcemousemod.
E.g. the default is:
static uint forcemousemod = ShiftMask;
...
{ XK_ANY_MOD, Button4, ttysend, {.s = "\031"} },
...
where ttysend will be invoked for button4 with any mod when not in mouse
mode, and with shift when in mouse mode.
Now it's possible to do this:
{ ShiftMask, Button4, ttysend, {.s = "foo"} },
{ XK_ANY_MOD, Button4, ttysend, {.s = "\031"} },
Which will invoke ttysend("foo") while shift is held and ttysend("\031")
otherwise. Shift still overrides mouse mode, but will now send "foo".
Previously with this setup the second binding was always invoked
because the forceousemod mask was always removed from the event.
Buttons which don't use forcemousemod behave the same as before.
This is useful e.g. for the scrollback mouse patch, which wants to
configure shift+wheel for scrollback, while keeping the normal behavior
without shift.