CONTACT US
forrester wave report 2023

Forrester rates Corelight a strong performer

GET THE REPORT

ad-nav-crowdstrike

Corelight now powers CrowdStrike solutions and services

READ MORE

ad-images-nav_0013_IDS

Alerts, meet evidence.

LEARN MORE ABOUT OUR IDS SOLUTION

ad-images-nav_white-paper

5 Ways Corelight Data Helps Investigators Win

READ WHITE PAPER

glossary-icon

10 Considerations for Implementing an XDR Strategy

READ NOW

ad-images-nav_0006_Blog

Don't trust. Verify with evidence

READ BLOG

ad-nav-NDR-for-dummies

NDR for Dummies

GET THE WHITE PAPER

video

The Power of Open-Source Tools for Network Detection and Response

WATCH THE WEBCAST

ad-nav-ESG

The Evolving Role of NDR

DOWNLOAD THE REPORT

ad-images-nav_0006_Blog

Detecting 5 Current APTs without heavy lifting

READ BLOG

g2-medal-best-support-ndr-winter-2024

Network Detection and Response

SUPPORT OVERVIEW

 

Tracking down a glibc regression

We’d just upgraded our glibc package from 2.32 to 2.33, when we noticed some peculiar behavior. Glibc 2.32 had a number of minor security issues and needed to be patched or upgraded. Instead of back-porting the patches to 2.32 we decided to upgrade directly to 3.33. After completing the upgrade and rebooting, a program was failing to start. Reverting glibc to 2.32 got things working again and confirmed that the glibc upgrade was the source of the trouble, but we wanted to understand why things were broken with the newer version.

The first thing I reached for was strace. Strace shows the active system calls that a program is making. Often when a program is hanging, strace will show that it is blocking in a call to open a tcp connection or lock a file. If you want to learn more about strace, Julia Evans has a great series of posts on it.  In this case strace showed that the broken program was not stuck anywhere, but very busy calling pselect6 and recvfrom:

pselect6(4, [3], NULL, NULL, {0, 500000000}, 0) = 1 (in [3], left {0, 499999521})
recvfrom(3, "..."..., 1514, 0, NULL, NULL) = 222
pselect6(4, [3], NULL, NULL, {0, 500000000}, 0) = 1 (in [3], left {0, 499999516})
recvfrom(3, "..."..., 1514, 0, NULL, NULL) = 514
pselect6(4, [3], NULL, NULL, {0, 500000000}, 0) = 1 (in [3], left {0, 499999529})	

When glibc was downgraded, strace showed it was using select instead of pselect6 and would eventually timeout:

select(4, [3], NULL, NULL, {0, 13272})  = 1 (in [3], left {0, 13234})
recvfrom(3, "..."..., 1514, 0, NULL, NULL) = 151
select(4, [3], NULL, NULL, {0, 13234})  = 1 (in [3], left {0, 13232})
recvfrom(3, "..."..., 1514, 0, NULL, NULL) = 151
select(4, [3], NULL, NULL, {0, 13232})  = 1 (in [3], left {0, 12712})
recvfrom(3, "..."..., 1514, 0, NULL, NULL) = 148
select(4, [3], NULL, NULL, {0, 12712})  = 1 (in [3], left {0, 12710})
recvfrom(3, "..."..., 1514, 0, NULL, NULL) = 148
select(4, [3], NULL, NULL, {0, 12710})  = 0 (Timeout)

So at this point we knew:

  • Newer glibc is using pselect6 instead of select
  • The select timeout argument is decreasing and eventually times out
  • The pselect6 timeout is not decreasing and it never times out.

Select is a libc function and syscall for implementing asynchronous IO.  There are more modern alternatives like epoll or io_uring, but for many years it was the standard for handling IO between different sources, or to read from a single source with a timeout.  One of the arguments to select is a timeout.  This is used to tell select how long to wait before returning that nothing was ready.

If you read the man page for select, it has the following to say about the timeout argument:

select() may update the timeout argument to indicate how 
much time was left.  pselect() does not change this argument.

On Linux, select() modifies timeout to reflect the amount 
of time not slept; most other implementations do not do this.  
(POSIX.1-2001 permits either behavior.)  This causes problems
both when Linux code which reads timeout is ported to 
other operating systems, and when code is ported to Linux  
that reuses a struct timeval for multiple select()s in a 
loop without reinitializing it. Consider timeout to be 
undefined after select() returns.

And if you look at the standard, it confirms that the timeout argument may be modified, but isn’t required to be.

Upon successful completion, the select() function may
modify the object pointed to by the timeout argument.

The program in question was relying on the timeout value changing after the call to select. It was running select in a loop and re-using the timeout value to act as an overall deadline for the loop. But why was it broken now? There didn’t appear to be anything in the release notes for glibc to indicate that they had changed this behavior.  A minimal test program was written to call select and output what the resulting timeout was. The test program uses fork to create a parent and child process. The parent calls select with a 4 second timeout, while the child pauses for 2 seconds and then writes something to the parent. We know we have the old behavior if the timeout option changes from 4 to 2.  If the timeout value doesn’t change, we know we are seeing the new behavior.

#include <stdio.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>
#include <string.h>
#include <stdlib.h>

int main(void) {
    fd_set rfds;
    struct timeval tv;
    int retval;

    int fd1[2];

    pid_t p;

    if (pipe(fd1)==-1) {
        perror("pipe");
        return 1;
    }

    p = fork();

    if (p < 0) {
        perror("fork");
        return 1;
    }

    // Parent process: Call select on the pipe with a 4 second timeout.
    if (p > 0) {
        FD_ZERO(&rfds);
        FD_SET(fd1[0], &rfds);
        /* Wait up to four seconds. */
        tv.tv_sec = 4;
        tv.tv_usec = 0;
        printf("tv before: %ld.%06ld\n", tv.tv_sec, tv.tv_usec);
        retval = select(fd1[0]+1, &rfds, NULL, NULL, &tv);
        if (retval == -1)
            perror("select()");
        else if (retval)
            printf("Data is available now.\n");
        else
            printf("No data within four seconds.\n");
        printf("tv after : %ld.%06ld\n", tv.tv_sec, tv.tv_usec);
        int broken = tv.tv_sec >= 3;
        printf("select: %s\n", broken ? "BROKEN" : "OK");
        return broken;
    } else { // child process: Pause for 2 seconds, then write something to the parent.
        sleep(2);
        write(fd1[1], "Hello", strlen("Hello"));
        close(fd1[1]);
        exit(0);
    }

}

On 2.32 this outputs:

tv before: 4.000000
Data is available now.
tv after : 2.000002
select: OK

But on 2.33 it outputs:

tv before: 4.000000
Data is available now.
tv after : 4.000000
select: BROKEN

Because the value of the timeout argument is not guaranteed to be updated after the call to select either of these outcomes is valid. This means it’s not truly broken behavior, but the change is still unexpected. The timeout argument is no longer being modified to reflect the amount of time not slept.  Any program expecting this behavior may go into an infinite loop or otherwise malfunction.  At this point we know exactly what changed, but we still don’t know when or why it was changed.

Git has a feature called bisect that can be used to narrow down exactly when this behavior changed.  Given a known good version and a known bad version, git will use binary search across the commit history to find the commit that introduced the problem.  In order for this process to work we need a way to run the test program against an arbitrary version of glibc.  After working out how to do this once manually, the process was turned into a script:

#!/bin/sh

rm -fr /tmp/glibc-build /tmp/lib /tmp/usr || exit 1
mkdir -p /tmp/glibc-build || exit 1
cd /tmp/glibc-build || exit 1

/tmp/src/glibc/configure           \
    --prefix=/tmp/usr              \
    --disable-profile              \
    libc_cv_slibdir=/tmp/lib > /dev/null || exit 125
make -j 8 > /dev/null || exit 125
make install > /dev/null || exit 125


SYSROOT=/tmp/usr
cd ~
gcc \
  -L${SYSROOT}/tmp/lib \
  -I${SYSROOT}/include \
  --sysroot=${SYSROOT} \
  -Wl,-rpath=/tmp/lib/ \
  -Wl,--dynamic-linker=/tmp/lib/ld-linux-x86-64.so.2 \
  -o select_test select_test.c || exit 1
exec ./select_test

This script will first build glibc from /tmp/glibc and install it into /tmp/usr and /tmp/lib. If the build succeeds, it will build and run the select test program. If the build fails, we make sure to exit with the status 125, which tells git bisect to skip this revision.  The required flags for gcc were pieced together from information at https://sourceware.org/glibc/wiki/Testing/Builds.

With this script in place, we can have git bisect run it for us and use it to narrow down the commit that changed the select behavior. First we tell it that 2.33 was “bad” and 2.31 was “good,” then we start the automatic process using the above script:

git checkout glibc-2.33
git bisect bad
git checkout glibc-2.31
git bisect good
git bisect run ~/bisect_glibc.sh

Because git bisect uses binary search it takes a small number of iterations to search through hundreds of commits. After a short wait the following commit was found:

commit 2433d39b69743f100f972e7886f91a2e21795ef0 (refs/bisect/bad)
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Mon Jul 6 16:06:51 2020 -0300

    linux: Add time64 select support
    
    The syscall __NR_pselect6_time64 (32-bit) or __NR_pselect6 (64-bit)
    is used as default.  For architectures with __ASSUME_TIME64_SYSCALLS
    the 32-bit fallback uses __NR_select/__NR__newselect or __NR_pselect6
    (it should cover the microblaze case where older kernels do not
    provide __NR_pselect6).

As part of the testing I ran the script against the latest commit in glibc and that showed the old select behavior again. Was this already fixed? The commit log showed a number of things related to select and timeouts, including bug numbers:

$ git log master --pretty=format:"%h%x09%ad%x09%s"|grep select|head -n 30
cedbf6d5f3      Thu Apr 8 07:39:32 2021 -0300   linux: always update select timeout (BZ #27706)
9d7c5cc38e      Wed Mar 31 13:53:34 2021 -0300  linux: Normalize and return timeout on select (BZ #27651)
49a40ba18e      Fri Apr 9 10:05:13 2021 -0300   libsupport: Add support_select_normalizes_timeout
5628f103f5      Fri Apr 9 10:02:30 2021 -0300   libsupport: Add support_select_modifies_timeout
1b53b5d970      Thu Mar 25 16:57:45 2021 -0300  misc: Fix tst-select timeout handling (BZ#27648)
7759be2593      Tue Mar 23 17:29:26 2021 +0100  S390: Also check vector support in memmove ifunc-selector [BZ #27511]
bff3019afc      Sat Mar 13 23:34:21 2021 +0100  tst: Provide test for select

Git bisect can be used to find the commit that restored the old behavior as well, the logic just needs to be flipped:

git checkout glibc-2.33
git bisect old
git checkout master
git bisect new
git bisect run ~/bisect_glibc_backwards.sh

Where the backwards version swaps out:

exec ./select_test

For:

exit ! ./select_test

That bisection results in confirming that the recent commits from only a few weeks ago are what restored the previous behavior:


commit 9d7c5cc38e58fb0923e88901f87174a511b61552 (refs/bisect/new)
Author: Adhemerval Zanella <adhemerval.zanella@linaro.org>
Date:   Wed Mar 31 13:53:34 2021 -0300

    linux: Normalize and return timeout on select (BZ #27651)
    
    The commit 2433d39b697, which added time64 support to select, changed
    the function to use __NR_pselect6 (or __NR_pelect6_time64) on all
    architectures.  However, on architectures where the symbol was
    implemented with __NR_select the kernel normalizes the passed timeout
    instead of return EINVAL.  For instance, the input timeval
    { 0, 5000000 } is interpreted as { 5, 0 }.
    
    And as indicated by BZ #27651, this semantic seems to be expected
    and changing it results in some performance issues (most likely
    the program does not check the return code and keeps issuing
    select with unormalized tv_usec argument).

By using a little Dockerfile we can quickly test various other distributions to see which ones might have had their behavior changed recently.  To test fedora 34, we can use the following Dockerfile (other distributions are similar):

FROM fedora:34
RUN /lib64/lib.so.6 | grep version #running libc outputs the version 
RUN dnf -y install gcc
ADD ./select_test.c .
RUN gcc -o select_test select_test.c
RUN ./select_test

This results in the following:

Distro libc Timeout value
Fedora 33 Glibc 2.32 Time remaining
Fedora 34 Glibc 2.33 Unchanged
Ubuntu 20.04 Glibc 2.31 Time remaining
Ubuntu 20.10 Glibc 2.32 Time remaining
Ubuntu 21.04 Glibc 2.33 Unchanged
Debian unstable Glibc 2.31 Time remaining
Archlinux latest Glibc 2.33 Unchanged
Alpine musl Unchanged
OS X   Unchanged

Even though the old behavior will soon be restored, linux distributions are shipping glibc 2.33. The latest versions of Fedora, Ubuntu, and Arch are using Glibc 2.33 and show the different select behavior. Platforms that are not based on glibc also do not modify the timeout argument.

This issue shows that relying on behavior that isn’t explicitly guaranteed will eventually break things in ways that might not immediately be obvious. Ultimately any programs that have this issue need to be patched to follow the standard and not rely on the implementation-defined behavior of the timeout argument. 

 

Recent Posts