Mirror of https://github.com/denoland/std.git (synced 2024-11-21 12:40:03 +00:00)
feat(archive): UntarStream and TarStream (#4548)
* refactor(archive): An implementation of Tar as streams
* fmt(archive): Ran `deno fmt`
* fix(archive): fixed JSDoc examples in tar_streams.ts
* fix(archive): fixed JSDoc examples so `deno task test` doesn't complain
* fix(archive): lint license error
* fix(archive): lint error files not exported
* set(archive): Set current time as mtime for default
* resolve(archive): resolves comments made
* add(archive): `{ mode: 'byob' }` support for TarStream
* add(archive): `{ mode: 'byob' }` support for UnTarStream
* adjust(archive): The logical flow of a few if statements
* tests(archive): Updated tests for Un/TarStream
* fix(archive): TarStream mtime wasn't an octal
* fix(archive): TarStream tests
* add(archive): Added parsePathname function, abstracting the logic out of TarStream and allowing the developer to validate pathnames before providing them to TarStream, instead of hoping it doesn't throw an error and require the archive creation to start all over again
* fix(archive): extra bytes were incorrectly appended at the end of files. When the appended file was exactly divisible by 512 bytes, an extra 512 bytes was appended instead of zero to fill in the gap, causing the next header to be read at the wrong place
* adjust(archive): to always return the amount of bytes requested. Instead of using enqueue, the leftover bytes are saved for the next buffer provided
* tweaks
* fix
* docs(archive): Link to the spec that they're following
* docs(archive): fix spelling
* add(archive): function validTarStreamOptions - to make sure, if TarStreamOptions are provided, that they are in the correct format so as to not create bad tarballs
* add(archive): more tests
* fix(archive): validTarStreamOptions
* add(archive): tests for validTarStreamOptions
* refactor(archive): code to copy the changes made in the @doctor/tar-stream version
* test(archive): added from @doctor/tar-stream
* chore: nit on anonymous function
* refactor(archive): UnTarStream, fixing an unexplainable memory leak - the second-newest test introduced here, '... with invalid ending', seems to detect a memory leak caused by an invalid tarball. I couldn't figure out why the memory leak was happening, but I know this restructure of the code doesn't have that same leak
* chore: fmt
* tests(archive): added remaining tests to cover as many lines as possible
* adjust(archive): remove the simplify-pathname code
* adjust(archive): remove checking for duplicate pathnames in the taring process
* adjust(archive): a readable will exist on TarEntry unless the typeflag is one of the string values 1-6
* tests(archive): added more tests for higher coverage
* adjust(archive): TarStream and UnTarStream to implement TransformStream
* docs(archive): moved TarStreamOptions docs into properties
* adjust(archive): TarStreamFile to take a ReadableStream instead of an Iterable | AsyncIterable
* adjust(archive): to use FixedChunkStream instead of rolling its own implementation
* fix(archive): lint error
* adjust(archive): Error types and messages
* adjust(archive): more Error messages / improve tests
* refactor(archive): UnTarStream to return TarStreamChunk instead of TarStreamEntry
* fix(archive): JSDoc example
* adjust(archive): mode, uid, gid options to be provided as numbers instead of strings
* adjust(archive): TarStream's pathname to be only of type string
* fix(archive): prefix/name to ignore everything past the first NULL
* adjust(archive): `checksum` and `pad` to not be exposed from UnTarStream
* adjust(archive): checksum calculation
* change(archive): `.slice` to `.subarray`
* doc(archive): "octal number" to "octal literal"
* adjust(archive): TarStreamOptions to be optional with defaults
* doc(archive): added more docs for the interfaces
* docs(archive): denoting defaults
* docs(archive): updated for new lint rules
* adjust(archive): tests to use assertRejects where appropriate & add a `validPathname` function - `validPathname` is meant to be a nicer exposed function for users of this lib to validate that their pathnames are valid before piping them through the TarStream, rather than exposing parsePathname, where the user may be confused about what to do with the result
* adjust(archive): to use `Date.now()` instead of `new Date().getTime()`

  Co-authored-by: ud2 <sjx233@qq.com>
* adjust(archive): mode, uid, and gid to be numbers instead of strings when untaring
* tests(archive): adjust two tests to also validate that the contents of the files are valid
* adjust(archive): linkname, uname, and gname to follow the same decoding rules as name and prefix
* rename(archive): UnTarStream to UntarStream
* fix(archive): type that was missed getting updated
* tests(archive): adjust the check-headers test to validate all header properties instead of just pathnames
* rename(archive): `pathname` properties to `path`
* docs(archive): updated to be more descriptive
* docs(archive): updated error types
* adjust(archive): `validPath` to `assertValidPath`
* adjust(archive): `validTarStreamOptions` to `assertValidTarStreamOptions`
* revert(archive): UntarStream to produce TarStreamEntry instead of TarStreamChunk
* refactor: remove redundant `void` return types
* docs: cleanup assertion function docs
* docs: correct `TarStream` example
* docs: minor docs cleanups
* refactor: improve error class specificity
* docs: add `@experimental` JSDoc tags
* docs(archive): updated examples for `assertValidPath` and `assertValidTarStreamOptions`
* fix(archive): problem with tests - I suspect a file read by `Deno.readDir` changed size between being read at `Deno.stat` and when `Deno.open` finished pulling it all in
* update error messages
* update error messages
* fix typos
* refactor: tweak error messages
* refactor: tweaks and add type field

---------

Co-authored-by: Asher Gomez <ashersaupingomez@gmail.com>
Co-authored-by: ud2 <sjx233@qq.com>
Co-authored-by: Yoshiya Hinosawa <stibium121@gmail.com>
This commit is contained in:
parent aa757d8803
commit 9298ea503f
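A minimal round-trip sketch of the new streaming API (not part of the commit; it mirrors the tests added below, with `TarStream` and `UntarStream` imported from the `@std/archive` exports this commit adds):

```ts
import { TarStream, type TarStreamInput } from "@std/archive/tar-stream";
import { UntarStream } from "@std/archive/untar-stream";

const text = new TextEncoder().encode("Hello World!");

// Archive a directory and a file, then immediately expand the result.
const entries = ReadableStream.from<TarStreamInput>([
  { type: "directory", path: "./potato" },
  {
    type: "file",
    path: "./text.txt",
    size: text.length,
    readable: ReadableStream.from([text.slice()]),
  },
])
  .pipeThrough(new TarStream())
  .pipeThrough(new UntarStream());

for await (const entry of entries) {
  console.log(entry.path, entry.header.typeflag);
  // Each entry's readable must be consumed or cancelled before the
  // next entry is yielded.
  await entry.readable?.cancel();
}
```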
archive/deno.json
@@ -3,7 +3,9 @@
   "version": "0.225.1",
   "exports": {
     ".": "./mod.ts",
+    "./tar-stream": "./tar_stream.ts",
     "./tar": "./tar.ts",
+    "./untar-stream": "./untar_stream.ts",
     "./untar": "./untar.ts"
   }
 }
archive/mod.ts
@@ -68,3 +68,5 @@
  */
 export * from "./tar.ts";
 export * from "./untar.ts";
+export * from "./tar_stream.ts";
+export * from "./untar_stream.ts";
archive/tar_stream.ts (new file, 569 lines)
@@ -0,0 +1,569 @@
// Copyright 2018-2024 the Deno authors. All rights reserved. MIT license.

/**
 * The interface required to provide a file.
 *
 * @experimental **UNSTABLE**: New API, yet to be vetted.
 */
export interface TarStreamFile {
  /**
   * The type of the input.
   */
  type: "file";
  /**
   * The path to the file, relative to the archive's root directory.
   */
  path: string;
  /**
   * The size of the file in bytes.
   */
  size: number;
  /**
   * The contents of the file.
   */
  readable: ReadableStream<Uint8Array>;
  /**
   * The metadata of the file.
   */
  options?: TarStreamOptions;
}

/**
 * The interface required to provide a directory.
 *
 * @experimental **UNSTABLE**: New API, yet to be vetted.
 */
export interface TarStreamDir {
  /**
   * The type of the input.
   */
  type: "directory";
  /**
   * The path of the directory, relative to the archive's root directory.
   */
  path: string;
  /**
   * The metadata of the directory.
   */
  options?: TarStreamOptions;
}

/**
 * A union type merging all the TarStream interfaces that can be piped into the
 * TarStream class.
 *
 * @experimental **UNSTABLE**: New API, yet to be vetted.
 */
export type TarStreamInput = TarStreamFile | TarStreamDir;

type TarStreamInputInternal =
  & (Omit<TarStreamFile, "path"> | Omit<TarStreamDir, "path">)
  & { path: [Uint8Array, Uint8Array] };

/**
 * The options that can go along with a file or directory.
 *
 * @experimental **UNSTABLE**: New API, yet to be vetted.
 */
export interface TarStreamOptions {
  /**
   * An octal literal.
   * Defaults to 755 for directories and 644 for files.
   */
  mode?: number;
  /**
   * An octal literal.
   * @default {0}
   */
  uid?: number;
  /**
   * An octal literal.
   * @default {0}
   */
  gid?: number;
  /**
   * A number of seconds since the start of epoch. Avoid negative values.
   * Defaults to the current time in seconds.
   */
  mtime?: number;
  /**
   * An ASCII string. Should be used in preference to uid.
   * @default {''}
   */
  uname?: string;
  /**
   * An ASCII string. Should be used in preference to gid.
   * @default {''}
   */
  gname?: string;
  /**
   * The major number of the device entry.
   * @default {''}
   */
  devmajor?: string;
  /**
   * The minor number of the device entry.
   * @default {''}
   */
  devminor?: string;
}

const SLASH_CODE_POINT = "/".charCodeAt(0);

/**
 * ### Overview
 * A TransformStream to create a tar archive. Tar archives allow for storing
 * multiple files in a single file (called an archive, or sometimes a tarball).
 * These archives typically have a single '.tar' extension. This
 * implementation follows the [FreeBSD 15.0](https://man.freebsd.org/cgi/man.cgi?query=tar&sektion=5&apropos=0&manpath=FreeBSD+15.0-CURRENT) spec.
 *
 * ### File Format & Limitations
 * The ustar file format is used for creating the tar archive. While this
 * format is compatible with most tar readers, the format has several
 * limitations, including:
 * - Paths must be at most 256 bytes.
 * - Files must be at most 8 GiBs in size, or 64 GiBs if the 12-digit size
 *   extension is used (applied automatically for larger files).
 * - Sparse files are not supported.
 *
 * ### Usage
 * TarStream may throw an error for several reasons. A few of those are:
 * - The path is invalid.
 * - The size provided does not match the number of bytes read from the
 *   readable.
 *
 * ### Compression
 * Tar archives are not compressed by default. If you'd like to compress the
 * archive, you may do so by piping it through a compression stream.
 *
 * @experimental **UNSTABLE**: New API, yet to be vetted.
 *
 * @example Usage
 * ```ts no-eval
 * import { TarStream, type TarStreamInput } from "@std/archive/tar-stream";
 *
 * await ReadableStream.from<TarStreamInput>([
 *   {
 *     type: "directory",
 *     path: 'potato/'
 *   },
 *   {
 *     type: "file",
 *     path: 'deno.json',
 *     size: (await Deno.stat('deno.json')).size,
 *     readable: (await Deno.open('deno.json')).readable
 *   },
 *   {
 *     type: "file",
 *     path: '.vscode/settings.json',
 *     size: (await Deno.stat('.vscode/settings.json')).size,
 *     readable: (await Deno.open('.vscode/settings.json')).readable
 *   }
 * ])
 *   .pipeThrough(new TarStream())
 *   .pipeThrough(new CompressionStream('gzip'))
 *   .pipeTo((await Deno.create('./out.tar.gz')).writable)
 * ```
 */
export class TarStream implements TransformStream<TarStreamInput, Uint8Array> {
  #readable: ReadableStream<Uint8Array>;
  #writable: WritableStream<TarStreamInput>;
  constructor() {
    const { readable, writable } = new TransformStream<
      TarStreamInput,
      TarStreamInputInternal
    >({
      transform(chunk, controller) {
        if (chunk.options) {
          try {
            assertValidTarStreamOptions(chunk.options);
          } catch (e) {
            return controller.error(e);
          }
        }

        if (
          "size" in chunk &&
          (chunk.size < 0 || 8 ** 12 < chunk.size ||
            chunk.size.toString() === "NaN")
        ) {
          return controller.error(
            new RangeError(
              "Cannot add to the tar archive: The size cannot exceed 64 Gibs",
            ),
          );
        }

        const path = parsePath(chunk.path);

        controller.enqueue({ ...chunk, path });
      },
    });
    this.#writable = writable;
    const gen = async function* () {
      const encoder = new TextEncoder();
      for await (const chunk of readable) {
        const [prefix, name] = chunk.path;
        const typeflag = "size" in chunk ? "0" : "5";
        const header = new Uint8Array(512);
        const size = "size" in chunk ? chunk.size : 0;
        const options: Required<TarStreamOptions> = {
          mode: typeflag === "5" ? 755 : 644,
          uid: 0,
          gid: 0,
          mtime: Math.floor(Date.now() / 1000),
          uname: "",
          gname: "",
          devmajor: "",
          devminor: "",
          ...chunk.options,
        };

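        // ustar header layout (byte offsets): name@0, mode@100, uid@108,
        // gid@116, size@124, mtime@136, checksum@148, typeflag@156,
        // linkname@157, magic@257, version@263, uname@265, gname@297,
        // devmajor@329, devminor@337, prefix@345.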
        header.set(name); // name
        header.set(
          encoder.encode(
            options.mode.toString().padStart(6, "0") + " \0" + // mode
              options.uid.toString().padStart(6, "0") + " \0" + // uid
              options.gid.toString().padStart(6, "0") + " \0" + // gid
              size.toString(8).padStart(size < 8 ** 11 ? 11 : 12, "0") +
              (size < 8 ** 11 ? " " : "") + // size
              options.mtime.toString(8).padStart(11, "0") + " " + // mtime
              " ".repeat(8) + // checksum | To be updated later
              typeflag + // typeflag
              "\0".repeat(100) + // linkname
              "ustar\0" + // magic
              "00" + // version
              options.uname.padEnd(32, "\0") + // uname
              options.gname.padEnd(32, "\0") + // gname
              options.devmajor.padStart(8, "\0") + // devmajor
              options.devminor.padStart(8, "\0"), // devminor
          ),
          100,
        );
        header.set(prefix, 345); // prefix
        // Update Checksum
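        // The checksum is the sum of all 512 header bytes with the checksum
        // field itself counted as eight spaces (pre-filled above), stored as
        // six octal digits followed by a NUL.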
        header.set(
          encoder.encode(
            header.reduce((x, y) => x + y).toString(8).padStart(6, "0") + "\0",
          ),
          148,
        );
        yield header;

        if ("size" in chunk) {
          let size = 0;
          for await (const value of chunk.readable) {
            size += value.length;
            yield value;
          }
          if (chunk.size !== size) {
            throw new RangeError(
              `Cannot add to the tar archive: The provided size (${chunk.size}) did not match bytes read from provided readable (${size})`,
            );
          }
          if (chunk.size % 512) {
            yield new Uint8Array(512 - size % 512);
          }
        }
      }
      yield new Uint8Array(1024);
    }();
    this.#readable = new ReadableStream({
      type: "bytes",
      async pull(controller) {
        const { done, value } = await gen.next();
        if (done) {
          controller.close();
          return controller.byobRequest?.respond(0);
        }
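        // BYOB path: copy as much as fits into the caller's buffer and
        // enqueue whatever is left over for the next read.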
        if (controller.byobRequest?.view) {
          const buffer = new Uint8Array(controller.byobRequest.view.buffer);

          const size = buffer.length;
          if (size < value.length) {
            buffer.set(value.slice(0, size));
            controller.byobRequest.respond(size);
            controller.enqueue(value.slice(size));
          } else {
            buffer.set(value);
            controller.byobRequest.respond(value.length);
          }
        } else {
          controller.enqueue(value);
        }
      },
    });
  }

  /**
   * The ReadableStream
   *
   * @return ReadableStream<Uint8Array>
   *
   * @example Usage
   * ```ts no-eval
   * import { TarStream } from "@std/archive/tar-stream";
   *
   * await ReadableStream.from([
   *   {
   *     type: "directory",
   *     path: 'potato/'
   *   },
   *   {
   *     type: "file",
   *     path: 'deno.json',
   *     size: (await Deno.stat('deno.json')).size,
   *     readable: (await Deno.open('deno.json')).readable
   *   },
   *   {
   *     type: "file",
   *     path: '.vscode/settings.json',
   *     size: (await Deno.stat('.vscode/settings.json')).size,
   *     readable: (await Deno.open('.vscode/settings.json')).readable
   *   }
   * ])
   *   .pipeThrough(new TarStream())
   *   .pipeThrough(new CompressionStream('gzip'))
   *   .pipeTo((await Deno.create('./out.tar.gz')).writable)
   * ```
   */
  get readable(): ReadableStream<Uint8Array> {
    return this.#readable;
  }

  /**
   * The WritableStream
   *
   * @return WritableStream<TarStreamInput>
   *
   * @example Usage
   * ```ts no-eval
   * import { TarStream } from "@std/archive/tar-stream";
   *
   * await ReadableStream.from([
   *   {
   *     type: "directory",
   *     path: 'potato/'
   *   },
   *   {
   *     type: "file",
   *     path: 'deno.json',
   *     size: (await Deno.stat('deno.json')).size,
   *     readable: (await Deno.open('deno.json')).readable
   *   },
   *   {
   *     type: "file",
   *     path: '.vscode/settings.json',
   *     size: (await Deno.stat('.vscode/settings.json')).size,
   *     readable: (await Deno.open('.vscode/settings.json')).readable
   *   }
   * ])
   *   .pipeThrough(new TarStream())
   *   .pipeThrough(new CompressionStream('gzip'))
   *   .pipeTo((await Deno.create('./out.tar.gz')).writable)
   * ```
   */
  get writable(): WritableStream<TarStreamInput> {
    return this.#writable;
  }
}

function parsePath(
  path: string,
): [Uint8Array, Uint8Array] {
  const name = new TextEncoder().encode(path);
  if (name.length <= 100) {
    return [new Uint8Array(0), name];
  }

  if (name.length > 256) {
    throw new RangeError(
      `Cannot parse the path as the path length cannot exceed 256 bytes: The path length is ${name.length}`,
    );
  }

  // If the length of the last part is > 100, there's no possible way to split the path
  let suitableSlashPos = Math.max(0, name.lastIndexOf(SLASH_CODE_POINT)); // always holds the position of a '/'
  if (name.length - suitableSlashPos > 100) {
    throw new RangeError(
      `Cannot parse the path as the filename cannot exceed 100 bytes: The filename length is ${
        name.length - suitableSlashPos
      }`,
    );
  }

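  // Move the split point leftwards while the name part still fits in 100
  // bytes, keeping the prefix as short as possible.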
  for (
    let nextPos = suitableSlashPos;
    nextPos > 0;
    suitableSlashPos = nextPos
  ) {
    // disclaimer: '/' won't appear at pos 0, so nextPos will always be > 0 or === -1
    nextPos = name.lastIndexOf(SLASH_CODE_POINT, suitableSlashPos - 1);
    // disclaimer: since name.length > 100 here, if nextPos === -1 then name.length - nextPos is also > 100
    if (name.length - nextPos > 100) {
      break;
    }
  }

  const prefix = name.slice(0, suitableSlashPos);
  if (prefix.length > 155) {
    throw new TypeError(
      "Cannot parse the path as the path needs to be split-able on a forward slash separator into [155, 100] bytes respectively",
    );
  }
  return [prefix, name.slice(suitableSlashPos + 1)];
}

/**
 * Asserts that the path provided is valid for a {@linkcode TarStream}.
 *
 * @experimental **UNSTABLE**: New API, yet to be vetted.
 *
 * It provides a means to check that a path is valid before piping it through
 * the `TarStream`, where an invalid path will throw an error, ruining any
 * progress made when archiving.
 *
 * @param path The path as a string
 *
 * @example Usage
 * ```ts no-assert no-eval
 * import { assertValidPath, TarStream, type TarStreamInput } from "@std/archive";
 *
 * const paths = (await Array.fromAsync(Deno.readDir("./")))
 *   .filter(entry => entry.isFile)
 *   .map((entry) => entry.name)
 *   // Filter out any paths that are invalid as they are to be placed inside a Tar.
 *   .filter(path => {
 *     try {
 *       assertValidPath(path);
 *       return true;
 *     } catch (error) {
 *       console.error(error);
 *       return false;
 *     }
 *   });
 *
 * await Deno.mkdir('./out/', { recursive: true })
 * await ReadableStream.from(paths)
 *   .pipeThrough(
 *     new TransformStream<string, TarStreamInput>({
 *       async transform(path, controller) {
 *         controller.enqueue({
 *           type: "file",
 *           path,
 *           size: (await Deno.stat(path)).size,
 *           readable: (await Deno.open(path)).readable,
 *         });
 *       },
 *     }),
 *   )
 *   .pipeThrough(new TarStream())
 *   .pipeThrough(new CompressionStream('gzip'))
 *   .pipeTo((await Deno.create('./out/archive.tar.gz')).writable);
 * ```
 */
export function assertValidPath(path: string) {
  parsePath(path);
}

/**
 * Asserts that the options provided are valid for a {@linkcode TarStream}.
 *
 * @experimental **UNSTABLE**: New API, yet to be vetted.
 *
 * @param options The TarStreamOptions
 *
 * @example Usage
 * ```ts no-assert no-eval
 * import { assertValidTarStreamOptions, TarStream, type TarStreamInput } from "@std/archive";
 *
 * const paths = (await Array.fromAsync(Deno.readDir('./')))
 *   .filter(entry => entry.isFile)
 *   .map(entry => entry.name);
 *
 * await Deno.mkdir('./out/', { recursive: true })
 * await ReadableStream.from(paths)
 *   .pipeThrough(new TransformStream<string, TarStreamInput>({
 *     async transform(path, controller) {
 *       const stats = await Deno.stat(path);
 *       const options = { mtime: stats.mtime?.getTime()! / 1000 };
 *       try {
 *         // Filter out any paths that would have invalid options provided.
 *         assertValidTarStreamOptions(options);
 *         controller.enqueue({
 *           type: "file",
 *           path,
 *           size: stats.size,
 *           readable: (await Deno.open(path)).readable,
 *           options,
 *         });
 *       } catch (error) {
 *         console.error(error);
 *       }
 *     },
 *   }))
 *   .pipeThrough(new TarStream())
 *   .pipeThrough(new CompressionStream('gzip'))
 *   .pipeTo((await Deno.create('./out/archive.tar.gz')).writable);
 * ```
 */
export function assertValidTarStreamOptions(options: TarStreamOptions) {
  if (
    options.mode &&
    (options.mode.toString().length > 6 ||
      !/^[0-7]*$/.test(options.mode.toString()))
  ) throw new TypeError("Cannot add to the tar archive: Invalid Mode provided");
  if (
    options.uid &&
    (options.uid.toString().length > 6 ||
      !/^[0-7]*$/.test(options.uid.toString()))
  ) throw new TypeError("Cannot add to the tar archive: Invalid UID provided");
  if (
    options.gid &&
    (options.gid.toString().length > 6 ||
      !/^[0-7]*$/.test(options.gid.toString()))
  ) throw new TypeError("Cannot add to the tar archive: Invalid GID provided");
  if (
    options.mtime != undefined &&
    (options.mtime.toString(8).length > 11 ||
      options.mtime.toString() === "NaN")
  ) {
    throw new TypeError(
      "Cannot add to the tar archive: Invalid MTime provided",
    );
  }
  if (
    options.uname &&
    // deno-lint-ignore no-control-regex
    (options.uname.length > 32 - 1 || !/^[\x00-\x7F]*$/.test(options.uname))
  ) {
    throw new TypeError(
      "Cannot add to the tar archive: Invalid UName provided",
    );
  }
  if (
    options.gname &&
    // deno-lint-ignore no-control-regex
    (options.gname.length > 32 - 1 || !/^[\x00-\x7F]*$/.test(options.gname))
  ) {
    throw new TypeError(
      "Cannot add to the tar archive: Invalid GName provided",
    );
  }
  if (
    options.devmajor &&
    (options.devmajor.length > 8)
  ) {
    throw new TypeError(
      "Cannot add to the tar archive: Invalid DevMajor provided",
    );
  }
  if (
    options.devminor &&
    (options.devminor.length > 8)
  ) {
    throw new TypeError(
      "Cannot add to the tar archive: Invalid DevMinor provided",
    );
  }
}
archive/tar_stream_test.ts (new file, 356 lines)
@@ -0,0 +1,356 @@
// Copyright 2018-2024 the Deno authors. All rights reserved. MIT license.
import {
  assertValidTarStreamOptions,
  TarStream,
  type TarStreamInput,
} from "./tar_stream.ts";
import { assertEquals, assertRejects, assertThrows } from "../assert/mod.ts";
import { UntarStream } from "./untar_stream.ts";
import { concat } from "../bytes/mod.ts";

Deno.test("TarStream() with default stream", async () => {
  const text = new TextEncoder().encode("Hello World!");

  const reader = ReadableStream.from<TarStreamInput>([
    {
      type: "directory",
      path: "./potato",
    },
    {
      type: "file",
      path: "./text.txt",
      size: text.length,
      readable: ReadableStream.from([text.slice()]),
    },
  ])
    .pipeThrough(new TarStream())
    .getReader();

  let size = 0;
  const data: Uint8Array[] = [];
  while (true) {
    const { done, value } = await reader.read();
    if (done) {
      break;
    }
    size += value.length;
    data.push(value);
  }
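  // Expected layout: two 512-byte headers, the file data padded up to a
  // multiple of 512 bytes, and the 1024-byte end-of-archive marker.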
  assertEquals(size, 512 + 512 + Math.ceil(text.length / 512) * 512 + 1024);
  assertEquals(
    text,
    concat(data).slice(
      512 + // Slicing off ./potato header
        512, // Slicing off ./text.txt header
      -1024, // Slicing off 1024 bytes of end padding
    )
      .slice(0, text.length), // Slice off padding added to text to make it divisible by 512
  );
});

Deno.test("TarStream() with byte stream", async () => {
  const text = new TextEncoder().encode("Hello World!");

  const reader = ReadableStream.from<TarStreamInput>([
    {
      type: "directory",
      path: "./potato",
    },
    {
      type: "file",
      path: "./text.txt",
      size: text.length,
      readable: ReadableStream.from([text.slice()]),
    },
  ])
    .pipeThrough(new TarStream())
    .getReader({ mode: "byob" });

  let size = 0;
  const data: Uint8Array[] = [];
  while (true) {
    const { done, value } = await reader.read(
      new Uint8Array(Math.ceil(Math.random() * 1024)),
    );
    if (done) {
      break;
    }
    size += value.length;
    data.push(value);
  }
  assertEquals(size, 512 + 512 + Math.ceil(text.length / 512) * 512 + 1024);
  assertEquals(
    text,
    concat(data).slice(
      512 + // Slicing off ./potato header
        512, // Slicing off ./text.txt header
      -1024, // Slicing off 1024 bytes of end padding
    )
      .slice(0, text.length), // Slice off padding added to text to make it divisible by 512
  );
});

Deno.test("TarStream() with negative size", async () => {
  const text = new TextEncoder().encode("Hello World");

  const readable = ReadableStream.from<TarStreamInput>([
    {
      type: "file",
      path: "name",
      size: -text.length,
      readable: ReadableStream.from([text.slice()]),
    },
  ])
    .pipeThrough(new TarStream());

  await assertRejects(
    () => Array.fromAsync(readable),
    RangeError,
    "Cannot add to the tar archive: The size cannot exceed 64 Gibs",
  );
});

Deno.test("TarStream() with 65 GiB size", async () => {
  const size = 1024 ** 3 * 65;
  const step = 1024; // Size must be evenly divisible by step
  const iterable = function* () {
    for (let i = 0; i < size; i += step) {
      yield new Uint8Array(step).map(() => Math.floor(Math.random() * 256));
    }
  }();

  const readable = ReadableStream.from<TarStreamInput>([
    {
      type: "file",
      path: "name",
      size,
      readable: ReadableStream.from(iterable),
    },
  ])
    .pipeThrough(new TarStream());

  await assertRejects(
    () => Array.fromAsync(readable),
    RangeError,
    "Cannot add to the tar archive: The size cannot exceed 64 Gibs",
  );
});

Deno.test("TarStream() with NaN size", async () => {
  const size = NaN;
  const step = 1024; // Size must be evenly divisible by step
  const iterable = function* () {
    for (let i = 0; i < size; i += step) {
      yield new Uint8Array(step).map(() => Math.floor(Math.random() * 256));
    }
  }();

  const readable = ReadableStream.from<TarStreamInput>([
    {
      type: "file",
      path: "name",
      size,
      readable: ReadableStream.from(iterable),
    },
  ])
    .pipeThrough(new TarStream());

  await assertRejects(
    () => Array.fromAsync(readable),
    RangeError,
    "Cannot add to the tar archive: The size cannot exceed 64 Gibs",
  );
});

Deno.test("parsePath()", async () => {
  const readable = ReadableStream.from<TarStreamInput>([
    {
      type: "directory",
      path:
        "./Veeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeery/LongPath",
    },
    {
      type: "directory",
      path:
        "./some random path/with/loooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong/path",
    },
    {
      type: "directory",
      path:
        "./some random path/with/loooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong/file",
    },
    {
      type: "directory",
      path:
        "./some random path/with/loooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong/file",
    },
  ])
    .pipeThrough(new TarStream())
    .pipeThrough(new UntarStream());

  const output = [
    "./Veeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeery/LongPath",
    "./some random path/with/loooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong/path",
    "./some random path/with/loooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong/file",
    "./some random path/with/loooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooooong/file",
  ];
  for await (const tarEntry of readable) {
    assertEquals(tarEntry.path, output.shift());
    tarEntry.readable?.cancel();
  }
});

Deno.test("validTarStreamOptions()", () => {
  assertValidTarStreamOptions({});

  assertValidTarStreamOptions({ mode: 0 });
  assertThrows(
    () => assertValidTarStreamOptions({ mode: 8 }),
    TypeError,
    "Invalid Mode provided",
  );
  assertThrows(
    () => assertValidTarStreamOptions({ mode: 1111111 }),
    TypeError,
    "Invalid Mode provided",
  );

  assertValidTarStreamOptions({ uid: 0 });
  assertThrows(
    () => assertValidTarStreamOptions({ uid: 8 }),
    TypeError,
    "Invalid UID provided",
  );
  assertThrows(
    () => assertValidTarStreamOptions({ uid: 1111111 }),
    TypeError,
    "Invalid UID provided",
  );

  assertValidTarStreamOptions({ gid: 0 });
  assertThrows(
    () => assertValidTarStreamOptions({ gid: 8 }),
    TypeError,
    "Invalid GID provided",
  );
  assertThrows(
    () => assertValidTarStreamOptions({ gid: 1111111 }),
    TypeError,
    "Invalid GID provided",
  );

  assertValidTarStreamOptions({ mtime: 0 });
  assertThrows(
    () => assertValidTarStreamOptions({ mtime: NaN }),
    TypeError,
    "Invalid MTime provided",
  );
  assertValidTarStreamOptions({
    mtime: Math.floor(new Date().getTime() / 1000),
  });
  assertThrows(
    () => assertValidTarStreamOptions({ mtime: new Date().getTime() }),
    TypeError,
    "Invalid MTime provided",
  );

  assertValidTarStreamOptions({ uname: "" });
  assertValidTarStreamOptions({ uname: "abcdef" });
  assertThrows(
    () => assertValidTarStreamOptions({ uname: "å-abcdef" }),
    TypeError,
    "Invalid UName provided",
  );
  assertThrows(
    () => assertValidTarStreamOptions({ uname: "a".repeat(100) }),
    TypeError,
    "Invalid UName provided",
  );

  assertValidTarStreamOptions({ gname: "" });
  assertValidTarStreamOptions({ gname: "abcdef" });
  assertThrows(
    () => assertValidTarStreamOptions({ gname: "å-abcdef" }),
    TypeError,
    "Invalid GName provided",
  );
  assertThrows(
    () => assertValidTarStreamOptions({ gname: "a".repeat(100) }),
    TypeError,
    "Invalid GName provided",
  );

  assertValidTarStreamOptions({ devmajor: "" });
  assertValidTarStreamOptions({ devmajor: "1234" });
  assertThrows(
    () => assertValidTarStreamOptions({ devmajor: "123456789" }),
    TypeError,
    "Invalid DevMajor provided",
  );

  assertValidTarStreamOptions({ devminor: "" });
  assertValidTarStreamOptions({ devminor: "1234" });
  assertThrows(
    () => assertValidTarStreamOptions({ devminor: "123456789" }),
    TypeError,
    "Invalid DevMinor provided",
  );
});

Deno.test("TarStream() with invalid options", async () => {
  const readable = ReadableStream.from<TarStreamInput>([
    { type: "directory", path: "potato", options: { mode: 9 } },
  ]).pipeThrough(new TarStream());

  await assertRejects(
    () => Array.fromAsync(readable),
    TypeError,
    "Invalid Mode provided",
  );
});

Deno.test("TarStream() with mismatching sizes", async () => {
  const text = new TextEncoder().encode("Hello World!");
  const readable = ReadableStream.from<TarStreamInput>([
    {
      type: "file",
      path: "potato",
      size: text.length + 1,
      readable: ReadableStream.from([text.slice()]),
    },
  ]).pipeThrough(new TarStream());

  await assertRejects(
    () => Array.fromAsync(readable),
    RangeError,
    "Cannot add to the tar archive: The provided size (13) did not match bytes read from provided readable (12)",
  );
});

Deno.test("parsePath() with too long path", async () => {
  const readable = ReadableStream.from<TarStreamInput>([{
    type: "directory",
    path: "0".repeat(300),
  }])
    .pipeThrough(new TarStream());

  await assertRejects(
    () => Array.fromAsync(readable),
    RangeError,
    "Cannot parse the path as the path length cannot exceed 256 bytes: The path length is 300",
  );
});

Deno.test("parsePath() with too long prefix", async () => {
  const readable = ReadableStream.from<TarStreamInput>([{
    type: "directory",
    path: "0".repeat(160) + "/",
  }])
    .pipeThrough(new TarStream());

  await assertRejects(
    () => Array.fromAsync(readable),
    TypeError,
    "Cannot parse the path as the path needs to be split-able on a forward slash separator into [155, 100] bytes respectively",
  );
});
archive/untar_stream.ts (new file, 387 lines)
@@ -0,0 +1,387 @@
// Copyright 2018-2024 the Deno authors. All rights reserved. MIT license.
import { FixedChunkStream } from "@std/streams";

/**
 * The original tar archive header format.
 *
 * @experimental **UNSTABLE**: New API, yet to be vetted.
 */
export interface OldStyleFormat {
  /**
   * The name of the entry.
   */
  name: string;
  /**
   * The mode of the entry.
   */
  mode: number;
  /**
   * The uid of the entry.
   */
  uid: number;
  /**
   * The gid of the entry.
   */
  gid: number;
  /**
   * The size of the entry.
   */
  size: number;
  /**
   * The mtime of the entry.
   */
  mtime: number;
  /**
   * The typeflag of the entry.
   */
  typeflag: string;
  /**
   * The linkname of the entry.
   */
  linkname: string;
}

/**
 * The POSIX ustar archive header format.
 *
 * @experimental **UNSTABLE**: New API, yet to be vetted.
 */
export interface PosixUstarFormat {
  /**
   * The latter half of the name of the entry.
   */
  name: string;
  /**
   * The mode of the entry.
   */
  mode: number;
  /**
   * The uid of the entry.
   */
  uid: number;
  /**
   * The gid of the entry.
   */
  gid: number;
  /**
   * The size of the entry.
   */
  size: number;
  /**
   * The mtime of the entry.
   */
  mtime: number;
  /**
   * The typeflag of the entry.
   */
  typeflag: string;
  /**
   * The linkname of the entry.
   */
  linkname: string;
  /**
   * The magic number of the entry.
   */
  magic: string;
  /**
   * The version number of the entry.
   */
  version: string;
  /**
   * The uname of the entry.
   */
  uname: string;
  /**
   * The gname of the entry.
   */
  gname: string;
  /**
   * The devmajor of the entry.
   */
  devmajor: string;
  /**
   * The devminor of the entry.
   */
  devminor: string;
  /**
   * The former half of the name of the entry.
   */
  prefix: string;
}

/**
 * The structure of an entry extracted from a Tar archive.
 *
 * @experimental **UNSTABLE**: New API, yet to be vetted.
 */
export interface TarStreamEntry {
  /**
   * The header information attributed to the entry, presented in one of two
   * valid forms.
   */
  header: OldStyleFormat | PosixUstarFormat;
  /**
   * The path of the entry as stated in the archive.
   */
  path: string;
  /**
   * The content of the entry, if the entry is a file.
   */
  readable?: ReadableStream<Uint8Array>;
}

/**
 * ### Overview
 * A TransformStream to expand a tar archive. Tar archives allow for storing
 * multiple files in a single file (called an archive, or sometimes a tarball).
 *
 * These archives typically have a single '.tar' extension. This
 * implementation follows the [FreeBSD 15.0](https://man.freebsd.org/cgi/man.cgi?query=tar&sektion=5&apropos=0&manpath=FreeBSD+15.0-CURRENT) spec.
 *
 * ### Supported File Formats
 * Only the ustar file format is supported. This is the most common format.
 * Additionally, the numeric extension for file size is supported.
 *
 * ### Usage
 * When expanding the archive, as demonstrated in the example, one must decide
 * to either consume the ReadableStream property, if present, or cancel it. The
 * next entry won't be resolved until the previous ReadableStream is either
 * consumed or cancelled.
 *
 * ### Understanding Compression
 * A tar archive may be compressed, often identified by an additional file
 * extension, such as '.tar.gz' for gzip. This TransformStream does not support
 * decompression, which must be done before expanding the archive.
 *
 * @experimental **UNSTABLE**: New API, yet to be vetted.
 *
 * @example Usage
 * ```ts no-eval
 * import { UntarStream } from "@std/archive/untar-stream";
 * import { dirname, normalize } from "@std/path";
 *
 * for await (
 *   const entry of (await Deno.open("./out.tar.gz"))
 *     .readable
 *     .pipeThrough(new DecompressionStream("gzip"))
 *     .pipeThrough(new UntarStream())
 * ) {
 *   const path = normalize(entry.path);
 *   await Deno.mkdir(dirname(path));
 *   await entry.readable?.pipeTo((await Deno.create(path)).writable);
 * }
 * ```
 */
export class UntarStream
  implements TransformStream<Uint8Array, TarStreamEntry> {
  #readable: ReadableStream<TarStreamEntry>;
  #writable: WritableStream<Uint8Array>;
  #gen: AsyncGenerator<Uint8Array>;
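  // Set while an entry's readable is being consumed; #untar waits for it to
  // clear before reading the next header.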
  #lock = false;
  constructor() {
    const { readable, writable } = new TransformStream<
      Uint8Array,
      Uint8Array
    >();
    this.#readable = ReadableStream.from(this.#untar());
    this.#writable = writable;

    this.#gen = async function* () {
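      // Hold back the last two 512-byte blocks so the end-of-archive marker
      // (two zeroed blocks) can be verified once the stream finishes.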
      const buffer: Uint8Array[] = [];
      for await (
        const chunk of readable.pipeThrough(new FixedChunkStream(512))
      ) {
        if (chunk.length !== 512) {
          throw new RangeError(
            `Cannot extract the tar archive: The tarball chunk has an unexpected number of bytes (${chunk.length})`,
          );
        }

        buffer.push(chunk);
        if (buffer.length > 2) yield buffer.shift()!;
      }
      if (buffer.length < 2) {
        throw new RangeError(
          "Cannot extract the tar archive: The tarball is too small to be valid",
        );
      }
      if (!buffer.every((value) => value.every((x) => x === 0))) {
        throw new TypeError(
          "Cannot extract the tar archive: The tarball has invalid ending",
        );
      }
    }();
  }

  async *#untar(): AsyncGenerator<TarStreamEntry> {
    const decoder = new TextDecoder();
    while (true) {
      while (this.#lock) {
        await new Promise((resolve) => setTimeout(resolve, 0));
      }

      const { done, value } = await this.#gen.next();
      if (done) break;

      // Validate Checksum
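      // The stored checksum was computed with the checksum field itself
      // treated as eight spaces, so blank the field out before summing.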
      const checksum = parseInt(
        decoder.decode(value.subarray(148, 156 - 2)),
        8,
      );
      value.fill(32, 148, 156);
      if (value.reduce((x, y) => x + y) !== checksum) {
        throw new SyntaxError(
          "Cannot extract the tar archive: An archive entry has invalid header checksum",
        );
      }

      // Decode Header
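      // Numeric fields are octal ASCII; the subarray end offsets exclude the
      // trailing ' \0' (or trailing space) terminators before parsing.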
      let header: OldStyleFormat | PosixUstarFormat = {
        name: decoder.decode(value.subarray(0, 100)).split("\0")[0]!,
        mode: parseInt(decoder.decode(value.subarray(100, 108 - 2))),
        uid: parseInt(decoder.decode(value.subarray(108, 116 - 2))),
        gid: parseInt(decoder.decode(value.subarray(116, 124 - 2))),
        size: parseInt(decoder.decode(value.subarray(124, 136)).trimEnd(), 8),
        mtime: parseInt(decoder.decode(value.subarray(136, 148 - 1)), 8),
        typeflag: decoder.decode(value.subarray(156, 157)),
        linkname: decoder.decode(value.subarray(157, 257)).split("\0")[0]!,
      };
      if (header.typeflag === "\0") header.typeflag = "0";
      // "ustar\u000000"
      if (
        [117, 115, 116, 97, 114, 0, 48, 48].every((byte, i) =>
          value[i + 257] === byte
        )
      ) {
        header = {
          ...header,
          magic: decoder.decode(value.subarray(257, 263)),
          version: decoder.decode(value.subarray(263, 265)),
          uname: decoder.decode(value.subarray(265, 297)).split("\0")[0]!,
          gname: decoder.decode(value.subarray(297, 329)).split("\0")[0]!,
          devmajor: decoder.decode(value.subarray(329, 337)).replaceAll(
            "\0",
            "",
          ),
          devminor: decoder.decode(value.subarray(337, 345)).replaceAll(
            "\0",
            "",
          ),
          prefix: decoder.decode(value.subarray(345, 500)).split("\0")[0]!,
        };
      }

      yield {
        path: ("prefix" in header && header.prefix.length
          ? header.prefix + "/"
          : "") + header.name,
        header,
        readable: ["1", "2", "3", "4", "5", "6"].includes(header.typeflag)
          ? undefined
          : this.#readableFile(header.size),
      };
    }
  }

  async *#genFile(size: number): AsyncGenerator<Uint8Array> {
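    // File data is stored in 512-byte blocks; the final block is zero-padded,
    // so trim it down to the bytes that remain.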
    for (let i = Math.ceil(size / 512); i > 0; --i) {
      const { done, value } = await this.#gen.next();
      if (done) {
        throw new SyntaxError(
          "Cannot extract the tar archive: Unexpected end of Tarball",
        );
      }
      if (i === 1 && size % 512) yield value.subarray(0, size % 512);
      else yield value;
    }
  }

  #readableFile(size: number): ReadableStream<Uint8Array> {
    this.#lock = true;
    const releaseLock = () => this.#lock = false;
    const gen = this.#genFile(size);
    return new ReadableStream({
      type: "bytes",
      async pull(controller) {
        const { done, value } = await gen.next();
        if (done) {
          releaseLock();
          controller.close();
          return controller.byobRequest?.respond(0);
        }
        if (controller.byobRequest?.view) {
          const buffer = new Uint8Array(controller.byobRequest.view.buffer);

          const size = buffer.length;
          if (size < value.length) {
            buffer.set(value.slice(0, size));
            controller.byobRequest.respond(size);
            controller.enqueue(value.slice(size));
          } else {
            buffer.set(value);
            controller.byobRequest.respond(value.length);
          }
        } else {
          controller.enqueue(value);
        }
      },
      async cancel() {
        // deno-lint-ignore no-empty
        for await (const _ of gen) {}
        releaseLock();
      },
    });
  }

  /**
   * The ReadableStream
   *
   * @return ReadableStream<TarStreamEntry>
   *
   * @example Usage
   * ```ts no-eval
   * import { UntarStream } from "@std/archive/untar-stream";
   * import { dirname, normalize } from "@std/path";
   *
   * for await (
   *   const entry of (await Deno.open("./out.tar.gz"))
   *     .readable
   *     .pipeThrough(new DecompressionStream("gzip"))
   *     .pipeThrough(new UntarStream())
   * ) {
   *   const path = normalize(entry.path);
   *   await Deno.mkdir(dirname(path));
   *   await entry.readable?.pipeTo((await Deno.create(path)).writable);
   * }
   * ```
   */
  get readable(): ReadableStream<TarStreamEntry> {
    return this.#readable;
  }

  /**
   * The WritableStream
   *
   * @return WritableStream<Uint8Array>
   *
   * @example Usage
   * ```ts no-eval
   * import { UntarStream } from "@std/archive/untar-stream";
   * import { dirname, normalize } from "@std/path";
   *
   * for await (
   *   const entry of (await Deno.open("./out.tar.gz"))
   *     .readable
   *     .pipeThrough(new DecompressionStream("gzip"))
   *     .pipeThrough(new UntarStream())
   * ) {
   *   const path = normalize(entry.path);
   *   await Deno.mkdir(dirname(path));
   *   await entry.readable?.pipeTo((await Deno.create(path)).writable);
   * }
   * ```
   */
  get writable(): WritableStream<Uint8Array> {
    return this.#writable;
  }
}
archive/untar_stream_test.ts (new file, 250 lines)
@@ -0,0 +1,250 @@
// Copyright 2018-2024 the Deno authors. All rights reserved. MIT license.
import { concat } from "../bytes/mod.ts";
import { TarStream, type TarStreamInput } from "./tar_stream.ts";
import {
  type OldStyleFormat,
  type PosixUstarFormat,
  UntarStream,
} from "./untar_stream.ts";
import { assertEquals, assertRejects } from "../assert/mod.ts";

Deno.test("expandTarArchiveCheckingHeaders", async () => {
  const text = new TextEncoder().encode("Hello World!");
  const seconds = Math.floor(Date.now() / 1000);

  const readable = ReadableStream.from<TarStreamInput>([
    {
      type: "directory",
      path: "./potato",
      options: {
        mode: 111111,
        uid: 12,
        gid: 21,
        mtime: seconds,
        uname: "potato",
        gname: "cake",
        devmajor: "ice",
        devminor: "scream",
      },
    },
    {
      type: "file",
      path: "./text.txt",
      size: text.length,
      readable: ReadableStream.from([text.slice()]),
      options: { mtime: seconds },
    },
  ])
    .pipeThrough(new TarStream())
    .pipeThrough(new UntarStream());

  const headers: (OldStyleFormat | PosixUstarFormat)[] = [];
  for await (const item of readable) {
    headers.push(item.header);
    await item.readable?.cancel();
  }
  assertEquals(headers, [{
    name: "./potato",
    mode: 111111,
    uid: 12,
    gid: 21,
    mtime: seconds,
    uname: "potato",
    gname: "cake",
    devmajor: "ice",
    devminor: "scream",
    size: 0,
    typeflag: "5",
    linkname: "",
    magic: "ustar\0",
    version: "00",
    prefix: "",
  }, {
    name: "./text.txt",
    mode: 644,
    uid: 0,
    gid: 0,
    mtime: seconds,
    uname: "",
    gname: "",
    devmajor: "",
    devminor: "",
    size: text.length,
    typeflag: "0",
    linkname: "",
    magic: "ustar\0",
    version: "00",
    prefix: "",
  }]);
});

Deno.test("expandTarArchiveCheckingBodies", async () => {
  const text = new TextEncoder().encode("Hello World!");

  const readable = ReadableStream.from<TarStreamInput>([
    {
      type: "directory",
      path: "./potato",
    },
    {
      type: "file",
      path: "./text.txt",
      size: text.length,
      readable: ReadableStream.from([text.slice()]),
    },
  ])
    .pipeThrough(new TarStream())
    .pipeThrough(new UntarStream());

  let buffer = new Uint8Array();
  for await (const item of readable) {
    if (item.readable) {
      buffer = concat(await Array.fromAsync(item.readable));
    }
  }
  assertEquals(buffer, text);
});

Deno.test("UntarStream() with size equals to multiple of 512", async () => {
  const size = 512 * 3;
  const data = Uint8Array.from(
    { length: size },
    () => Math.floor(Math.random() * 256),
  );

  const readable = ReadableStream.from<TarStreamInput>([
    {
      type: "file",
      path: "name",
      size,
      readable: ReadableStream.from([data.slice()]),
    },
  ])
    .pipeThrough(new TarStream())
    .pipeThrough(new UntarStream());

  let buffer = new Uint8Array();
  for await (const entry of readable) {
    if (entry.readable) {
      buffer = concat(await Array.fromAsync(entry.readable));
    }
  }
  assertEquals(buffer, data);
});

Deno.test("UntarStream() with invalid size", async () => {
  const readable = ReadableStream.from<TarStreamInput>([
    {
      type: "file",
      path: "newFile.txt",
      size: 512,
      readable: ReadableStream.from([new Uint8Array(512).fill(97)]),
    },
  ])
    .pipeThrough(new TarStream())
    .pipeThrough(
      new TransformStream<Uint8Array, Uint8Array>({
        flush(controller) {
          controller.enqueue(new Uint8Array(100));
        },
      }),
    )
    .pipeThrough(new UntarStream());

  await assertRejects(
    async () => {
      for await (const entry of readable) {
        if (entry.readable) {
          // deno-lint-ignore no-empty
          for await (const _ of entry.readable) {}
        }
      }
    },
    RangeError,
    "Cannot extract the tar archive: The tarball chunk has an unexpected number of bytes (100)",
  );
});

Deno.test("UntarStream() with invalid ending", async () => {
  const tarBytes = concat(
    await Array.fromAsync(
      ReadableStream.from<TarStreamInput>([
        {
          type: "file",
          path: "newFile.txt",
          size: 512,
          readable: ReadableStream.from([new Uint8Array(512).fill(97)]),
        },
      ])
        .pipeThrough(new TarStream()),
    ),
  );
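  // Corrupt the final byte of the end-of-archive marker.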
  tarBytes[tarBytes.length - 1] = 1;

  const readable = ReadableStream.from([tarBytes])
    .pipeThrough(new UntarStream());

  await assertRejects(
    async () => {
      for await (const entry of readable) {
        if (entry.readable) {
          // deno-lint-ignore no-empty
          for await (const _ of entry.readable) {}
        }
      }
    },
    TypeError,
    "Cannot extract the tar archive: The tarball has invalid ending",
  );
});

Deno.test("UntarStream() with too small size", async () => {
  const readable = ReadableStream.from([new Uint8Array(512)])
    .pipeThrough(new UntarStream());

  await assertRejects(
    async () => {
      for await (const entry of readable) {
        if (entry.readable) {
          // deno-lint-ignore no-empty
          for await (const _ of entry.readable) {}
        }
      }
    },
    RangeError,
    "Cannot extract the tar archive: The tarball is too small to be valid",
  );
});

Deno.test("UntarStream() with invalid checksum", async () => {
  const tarBytes = concat(
    await Array.fromAsync(
      ReadableStream.from<TarStreamInput>([
        {
          type: "file",
          path: "newFile.txt",
          size: 512,
          readable: ReadableStream.from([new Uint8Array(512).fill(97)]),
        },
      ])
        .pipeThrough(new TarStream()),
    ),
  );
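  // Corrupt the first byte of the header checksum field.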
  tarBytes[148] = 97;

  const readable = ReadableStream.from([tarBytes])
    .pipeThrough(new UntarStream());

  await assertRejects(
    async () => {
      for await (const entry of readable) {
        if (entry.readable) {
          // deno-lint-ignore no-empty
          for await (const _ of entry.readable) {}
        }
      }
    },
    Error,
    "Cannot extract the tar archive: An archive entry has invalid header checksum",
  );
});