2025-11-30 21:19:04 +09:00

System-Level I/O

I/O is the process of copying data between main memory and external devices.

In Linux, a file is a sequence of m bytes.

All I/O devices are represented as files. Even the kernel is represented as a file.

Unix I/O

  • open and close: opening and closing files
  • read and write: reading and writing files
  • lseek: changing the current file position
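
As a small illustration of lseek, the sketch below reports a file's size by moving the file position to the end of the file. The helper name size_via_lseek is hypothetical and is not part of any library.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

// Hypothetical helper: return the size in bytes of the file at path,
// or -1 on error. It never reads the file; it only moves the position.
off_t size_via_lseek(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    // Seeking 0 bytes from SEEK_END returns the offset of the end of file.
    off_t end = lseek(fd, 0, SEEK_END);
    close(fd);
    return end; // -1 if lseek failed (for example, fd is not seekable)
}
```

The same call with SEEK_SET or SEEK_CUR moves the position relative to the start of the file or to the current position.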

File Types

  • Regular files
  • Directory
  • Socket
  • ...

Regular Files

A regular file contains arbitrary data.

For example, a text file is a sequence of text lines, each terminated by an end-of-line (EOL) sequence. The EOL sequence differs between systems: \n on Unix, \r\n on Windows and in Internet protocols.
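
Because the EOL sequence differs, code that handles text lines from both kinds of systems often strips either form. A minimal sketch, assuming the line is already a NUL-terminated string (the helper name chomp is hypothetical):

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper: strip one trailing "\n" or "\r\n" in place and
// return the new length of the line. A lone "\r" is left untouched.
size_t chomp(char *line) {
    size_t len = strlen(line);
    if (len > 0 && line[len - 1] == '\n') {
        line[--len] = '\0';
        if (len > 0 && line[len - 1] == '\r')
            line[--len] = '\0';
    }
    return len;
}
```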

Directories

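A directory contains an array of links, each mapping a file name to a file. Every directory has at least two links: . (the directory itself) and .. (its parent directory).

Commands for working with directories:

  • ls: view directory contents
  • mkdir: create an empty directory
  • rmdir: delete an empty directory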
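
The . and .. links can be observed from a program with opendir and readdir. The sketch below counts how many of the two show up in a directory listing; the helper name count_dot_links is hypothetical, and the expectation that both appear holds on typical Linux file systems rather than being guaranteed everywhere.

```c
#include <dirent.h>
#include <string.h>

// Hypothetical helper: count how many of the entries "." and ".." appear
// in the directory at path. Returns 0..2, or -1 if it cannot be opened.
int count_dot_links(const char *path) {
    DIR *dir = opendir(path);
    if (dir == NULL)
        return -1;
    int found = 0;
    struct dirent *entry;
    while ((entry = readdir(dir)) != NULL) {
        if (strcmp(entry->d_name, ".") == 0 ||
            strcmp(entry->d_name, "..") == 0)
            found++;
    }
    closedir(dir);
    return found;
}
```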

All files are organized as a hierarchy anchored by the root directory, named /.

The kernel maintains a current working directory (cwd) for each process. It can be changed with the cd command.
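
Inside a program, the cwd is changed with chdir and queried with getcwd; the shell's cd is a builtin that ends up calling chdir on the shell's own process. A minimal sketch (the helper name change_and_report_cwd is hypothetical):

```c
#include <stddef.h>
#include <unistd.h>

// Hypothetical helper: change this process's cwd to dir, then copy the
// resulting absolute path into out. Returns 0 on success, -1 on error.
int change_and_report_cwd(const char *dir, char *out, size_t outsize) {
    if (chdir(dir) < 0)
        return -1;
    if (getcwd(out, outsize) == NULL)
        return -1;
    return 0;
}
```

The change affects only the calling process, which is why cd cannot be implemented as a separate program.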

Path names

  • Absolute: /home/yenru0/workspace (starts with /, resolved from the root)
  • Relative: ../workspace (resolved from the cwd)

Open & Close & Read & Write

int fd;

if ((fd = open("file.txt", O_RDONLY)) < 0) {
    perror("open");
    exit(1);
}
  • On success, open returns a small non-negative integer called a file descriptor (fd).
    • fd == -1 indicates an error.
    • 0: stdin, 1: stdout, 2: stderr
int ret; // fd is a descriptor returned by an earlier open
if ((ret = close(fd)) < 0) {
    perror("close");
    exit(1);
}

Closing an already-closed descriptor can be disastrous in threaded programs, so always check the return code.

char buf[512];
ssize_t nbytes; // bytes actually read: 0 on EOF, -1 on error

nbytes = read(fd, buf, sizeof(buf));

ssize_t read(int fd, void *usrbuf, size_t n);

read returns the number of bytes read from fd into buf. A return value of 0 indicates EOF. ssize_t is a signed version of size_t, which is needed because the return value can be -1.

If read returns a negative value, an error occurred.

ssize_t write(int fd, const void *usrbuf, size_t n);

write returns the number of bytes written from usrbuf to fd. If write returns a negative value, an error occurred.
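
Putting read and write together, the sketch below copies everything from one descriptor to another and checks both return values. The helper name copy_fd is hypothetical, and the 512-byte buffer simply mirrors the snippet above.

```c
#include <unistd.h>

// Hypothetical helper: copy everything from infd to outfd.
// Returns the total number of bytes copied, or -1 on error.
ssize_t copy_fd(int infd, int outfd) {
    char buf[512];
    ssize_t nread, total = 0;

    while ((nread = read(infd, buf, sizeof(buf))) > 0) {
        ssize_t done = 0;
        while (done < nread) { // write may transfer fewer bytes than asked
            ssize_t nw = write(outfd, buf + done, (size_t)(nread - done));
            if (nw < 0)
                return -1;
            done += nw;
        }
        total += nread;
    }
    return (nread < 0) ? -1 : total; // nread == 0 means EOF
}
```

Calling copy_fd(STDIN_FILENO, STDOUT_FILENO) would behave like a minimal cat.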

Short Counts

A short count means that read or write transferred fewer bytes than requested. Short counts can occur in these situations:

  • EOF on reads
  • Reading text lines from a terminal
  • Reading from a network socket

In practice, short counts never occur in these situations:

  • Reading from disk files (except for EOF)
  • Writing to disk files
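
A common way to cope with short counts is to call read in a loop until the requested number of bytes has arrived or EOF is reached. The sketch below shows the idea; read_fully is a hypothetical name, and this is an illustration rather than the source of any particular library.

```c
#include <errno.h>
#include <unistd.h>

// Hypothetical helper: keep calling read until n bytes have arrived or EOF.
// Returns the number of bytes read (less than n only at EOF), or -1 on error.
ssize_t read_fully(int fd, void *usrbuf, size_t n) {
    char *p = usrbuf;
    size_t left = n;

    while (left > 0) {
        ssize_t nread = read(fd, p, left);
        if (nread < 0) {
            if (errno == EINTR) // interrupted by a signal: retry
                continue;
            return -1;
        }
        if (nread == 0) // EOF
            break;
        left -= (size_t)nread; // short count: ask again for the rest
        p += nread;
    }
    return (ssize_t)(n - left);
}
```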

RIO Package

RIO is a set of wrappers that provide efficient and robust I/O in applications that are subject to short counts.

  • unbuffered RIO functions: rio_readn, rio_writen
  • buffered RIO functions: rio_readnb, rio_readlineb
    • buffered RIO functions are thread-safe, and calls to them can be interleaved arbitrarily on the same descriptor.

Buffered RIO

To read efficiently from a file, RIO caches part of the file in an internal memory buffer (the rio_t structure).

When reading from a file, the buffer holds a portion of the file: bytes that have already been handed to the user, followed by bytes that have been fetched but not yet read. rio_readnb and rio_readlineb refill the buffer automatically as needed.

typedef struct {
    int rio_fd;                // Descriptor for this internal buf
    int rio_cnt;               // Unread bytes in internal buf
    char *rio_bufptr;          // Next unread byte in internal buf
    char rio_buf[RIO_BUFSIZE]; // Internal buffer
} rio_t;

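Example: copy standard input to standard output, one text line at a time.

#include "csapp.h" // CS:APP header that declares rio_t, MAXLINE and the rio_* functions

int main(int argc, char **argv) {
    ssize_t n;
    rio_t rio;
    char buf[MAXLINE];

    rio_readinitb(&rio, STDIN_FILENO);
    // rio_readlineb returns 0 on EOF and -1 on error, so stop on either
    while ((n = rio_readlineb(&rio, buf, MAXLINE)) > 0) {
        rio_writen(STDOUT_FILENO, buf, n);
    }
    exit(0);
}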
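
The automatic refill can be sketched as follows. This is a simplified illustration and not the RIO source: the names mybuf_t, mybuf_init, mybuf_read and MYBUF_SIZE are hypothetical, and the sketch does not retry after EINTR.

```c
#include <string.h>
#include <unistd.h>

#define MYBUF_SIZE 8192 // hypothetical; stands in for RIO_BUFSIZE

typedef struct {
    int fd;               // descriptor being read
    ssize_t cnt;          // unread bytes left in buf
    char *ptr;            // next unread byte in buf
    char buf[MYBUF_SIZE]; // internal buffer
} mybuf_t;

void mybuf_init(mybuf_t *bp, int fd) {
    bp->fd = fd;
    bp->cnt = 0;
    bp->ptr = bp->buf;
}

// Copy up to n bytes to usrbuf. The internal buffer is refilled with one
// read call only when it is empty, so most calls are plain memory copies.
// Returns the number of bytes copied, 0 on EOF, -1 on error.
ssize_t mybuf_read(mybuf_t *bp, char *usrbuf, size_t n) {
    if (bp->cnt <= 0) { // buffer empty: refill
        ssize_t got = read(bp->fd, bp->buf, sizeof(bp->buf));
        if (got <= 0) {
            bp->cnt = 0;
            return got; // 0 on EOF, -1 on error
        }
        bp->cnt = got;
        bp->ptr = bp->buf;
    }
    size_t take = (n < (size_t)bp->cnt) ? n : (size_t)bp->cnt;
    memcpy(usrbuf, bp->ptr, take);
    bp->ptr += take;
    bp->cnt -= (ssize_t)take;
    return (ssize_t)take;
}
```

Reading one byte at a time through mybuf_read costs one system call per refill instead of one per byte.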

Metadata

Metadata is data about data, for example a file's access permissions, size, and type.

  • Per-process metadata
    • when a process opens a file, the kernel creates an entry in a per-process table called the file descriptor table
  • Per-file metadata
    • can be accessed using the stat system call, which fills in the following structure
struct stat {
    dev_t     st_dev;     // ID of device containing file
    ino_t     st_ino;     // inode number
    mode_t    st_mode;    // protection
    nlink_t   st_nlink;   // number of hard links
    uid_t     st_uid;     // user ID of owner
    gid_t     st_gid;     // group ID of owner
    dev_t     st_rdev;    // device ID (if special file)
    off_t     st_size;    // total size, in bytes
    blksize_t st_blksize; // blocksize for filesystem I/O
    blkcnt_t  st_blocks;  // number of 512B blocks allocated
    time_t    st_atime;   // time of last access
    time_t    st_mtime;   // time of last modification
    time_t    st_ctime;   // time of last status change
};
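
As a usage sketch, the helpers below call stat and inspect st_mode and st_size. The helper names file_kind and file_size are hypothetical; S_ISREG and S_ISDIR are the standard macros for testing the file type bits of st_mode.

```c
#include <sys/stat.h>
#include <sys/types.h>

// Hypothetical helper: classify the file at path using st_mode.
// Returns "regular", "directory", "other", or "error".
const char *file_kind(const char *path) {
    struct stat st;
    if (stat(path, &st) < 0)
        return "error";
    if (S_ISREG(st.st_mode))
        return "regular";
    if (S_ISDIR(st.st_mode))
        return "directory";
    return "other";
}

// Hypothetical helper: return st_size for path, or -1 on error.
off_t file_size(const char *path) {
    struct stat st;
    return (stat(path, &st) < 0) ? -1 : st.st_size;
}
```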

How the Kernel Represents Open Files

  • Descriptor table (per-process)
  • Open file table (shared by all processes)
  • v-node table (shared by all processes)

When a process opens a file, the kernel creates an entry in the per-process file descriptor table. Each entry contains a pointer to an entry in the open file table. Each entry in the open file table contains a pointer to an entry in the v-node table.

When a process calls fork, the child inherits copies of the parent's file descriptors. Each copied entry points to the same open file table entry as the parent's entry, so that entry's reference count (refcnt) is incremented.
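
One visible consequence is that parent and child share the file position, because the position is stored in the open file table entry. The sketch below illustrates this and contrasts it with two independent open calls; both function names are hypothetical.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical demo: the child reads one byte from an inherited descriptor,
// then the parent reads one byte. Both descriptors refer to the same open
// file table entry, so the parent sees the second byte of the file.
// Returns the byte the parent read, or -1 on error.
int parent_byte_after_child_read(const char *path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) { // child: advance the shared position by one byte
        char c;
        _exit(read(fd, &c, 1) == 1 ? 0 : 1);
    }

    int status;
    char c;
    int result = -1;
    if (waitpid(pid, &status, 0) == pid &&
        WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
        read(fd, &c, 1) == 1)
        result = (unsigned char)c;
    close(fd);
    return result;
}

// For contrast: two separate open calls create two open file table entries,
// so each descriptor has its own position and both read the first byte.
// Returns the byte read through the second descriptor, or -1 on error.
int second_open_first_byte(const char *path) {
    int fd1 = open(path, O_RDONLY);
    int fd2 = open(path, O_RDONLY);
    char c1, c2;
    int result = -1;
    if (fd1 >= 0 && fd2 >= 0 &&
        read(fd1, &c1, 1) == 1 && read(fd2, &c2, 1) == 1)
        result = (unsigned char)c2;
    if (fd1 >= 0)
        close(fd1);
    if (fd2 >= 0)
        close(fd2);
    return result;
}
```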

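I/O Redirection

How does a shell implement I/O redirection such as ls > foo.txt?

Answer: by calling dup2(oldfd, newfd), which copies descriptor table entry oldfd to entry newfd, overwriting whatever newfd referred to before. For example, dup2(4, 1) makes stdout (descriptor 1) point to the same open file as descriptor 4.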
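
A minimal sketch of the dup2 technique is shown below. A shell would typically call dup2 in the child between fork and exec and never restore the descriptor; this sketch instead saves and restores stdout so it can run inside one process. The helper name write_stdout_redirected is hypothetical.

```c
#include <fcntl.h>
#include <unistd.h>

// Hypothetical helper: write msg (len bytes) to descriptor 1 while it is
// temporarily redirected to the file at path, then restore the original
// stdout. Returns 0 on success, -1 on error.
int write_stdout_redirected(const char *path, const char *msg, size_t len) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    int saved = dup(STDOUT_FILENO); // keep a handle on the original stdout
    if (saved < 0) {
        close(fd);
        return -1;
    }
    int rc = -1;
    if (dup2(fd, STDOUT_FILENO) >= 0) { // descriptor 1 now refers to the file
        if (write(STDOUT_FILENO, msg, len) == (ssize_t)len)
            rc = 0;
        if (dup2(saved, STDOUT_FILENO) < 0) // put stdout back
            rc = -1;
    }
    close(saved);
    close(fd);
    return rc;
}
```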

stdio

The C standard library (libc.so) provides a collection of higher-level standard I/O functions.

  • fopen, fclose, fread, fwrite, fgets, fputs, fscanf, fprintf

stdio models open files as streams. A stream is an abstraction for a file descriptor and a buffer in memory. Every C program begins with three open streams:

extern FILE * stdin;
extern FILE * stdout;
extern FILE * stderr;

Buffered I/O

Applications often read and write one character at a time. However, the Unix read and write system calls are expensive, so calling them once per character is wasteful. The solution is buffered I/O: use read or write to transfer a whole block of data to or from a buffer, and let the application read or write one character at a time from or to that buffer. This is efficient because each access is a simple memory access.

stdio is buffered. printf does not write to stdout immediately; the output is first stored in a buffer. The buffer is flushed to the file with the write system call when fflush(stdout) is called, when exit is called, or when main returns.

Remark

  • Unix I/O
  • RIO package
  • stdio

When to use

  • stdio: disk or terminal files
  • Unix I/O: inside signal handlers (read and write are async-signal-safe), or when you need the absolute highest performance
  • RIO: networking (reading and writing network sockets)

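Binary Files

Do not use the following on binary files:

  • text-oriented I/O (fgets, scanf, rio_readlineb): these interpret EOL characters, which have no special meaning in binary data
  • string functions (strlen, strcpy, strcat, strcmp): these treat a zero byte as the end of the string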
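
The problem with string functions can be seen on a buffer that contains a zero byte. In this sketch the helper names are hypothetical; the point is that binary data needs an explicit length and length-based functions such as memcpy and memcmp.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper: copy n bytes of arbitrary data, zero bytes included.
// The caller must track the length explicitly.
void copy_bytes(unsigned char *dst, const unsigned char *src, size_t n) {
    memcpy(dst, src, n);
}

// What a string function believes the length is: it stops at the first
// zero byte, so it under-reports binary data that contains 0x00.
// The data must contain at least one zero byte, or strlen reads past the end.
size_t length_seen_by_strlen(const unsigned char *data) {
    return strlen((const char *)data);
}
```

For the four bytes 0x01 0x00 0x02 0x03, length_seen_by_strlen reports 1, while copy_bytes with n = 4 preserves all four bytes.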