Skip to content

Conversation

@alinpahontu2912
Copy link
Member

Fix ZipArchive in Update mode rewriting when no changes made

Opening an entry stream in ZipArchiveMode.Update would immediately mark the archive as modified, causing Dispose to attempt a rewrite even when no actual writes occurred. This threw NotSupportedException on non-expandable streams like fixed-size MemoryStream.

The fix tracks actual writes via WrappedStream and only marks the entry as modified when data is written. Archives created on empty streams are marked as new to ensure they're still written on first creation.

Fixes #123419

@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @karelz, @dotnet/area-system-io-compression
See info in area-owners.md if you want to be subscribed.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a regression where opening entry streams in ZipArchiveMode.Update could mark the archive as modified even when no data was written, causing unnecessary rewrites on Dispose (and failures on non-expandable streams).

Changes:

  • Track actual writes in Update mode by notifying the entry only on the first write to the opened stream.
  • Skip rewriting Update-mode archives on Dispose unless the archive is actually modified (while still writing newly-created empty archives).
  • Add regression tests covering Package/Zip scenarios where entries are read without writes.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/libraries/System.IO.Packaging/tests/Tests.cs Adds regression test ensuring Package.Open(ReadWrite) doesn’t throw on Dispose when only reading.
src/libraries/System.IO.Compression/tests/ZipArchive/zip_UpdateTests.cs Updates/extends Update-mode tests and adds regression tests for non-expandable streams.
src/libraries/System.IO.Compression/src/System/IO/Compression/ZipCustomStreams.cs Adds write-notification plumbing to mark entries modified only when written.
src/libraries/System.IO.Compression/src/System/IO/Compression/ZipArchiveEntry.cs Stops marking entries modified on open; introduces MarkAsModified; adjusts write path to reuse original compressed bytes if unchanged.
src/libraries/System.IO.Compression/src/System/IO/Compression/ZipArchiveEntry.Async.cs Async equivalents of Update-mode changes and unchanged-data handling.
src/libraries/System.IO.Compression/src/System/IO/Compression/ZipArchive.cs Adds Update-mode “write only if modified” behavior and “new archive” detection.
src/libraries/System.IO.Compression/src/System/IO/Compression/ZipArchive.Async.cs Async equivalents of Update-mode “write only if modified” and “new archive” detection.

Comment on lines +1356 to +1362
long originalLength = compressed.Length;

// Use a non-expandable MemoryStream (fixed buffer) - this is what Package.Open does internally
// This would throw NotSupportedException if Dispose tries to write/grow the stream
using MemoryStream stream = new MemoryStream(compressed);

// Open in Update mode (like Package.Open with FileAccess.ReadWrite)
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

originalLength is declared but never used, which can fail the build if warnings are treated as errors. Remove it or use it in an assertion (e.g., verify the backing stream length remains unchanged).

Copilot uses AI. Check for mistakes.
Comment on lines +1182 to +1193
// Check if the entry's stored data was actually modified (StoredData flag is set).
// If _storedUncompressedData is loaded but StoredData is not set, it means the entry
// was opened for update but no writes occurred - we should use the original compressed bytes.
bool storedDataModified = (Changes & ZipArchive.ChangeState.StoredData) != 0;

// If _storedUncompressedData is loaded but not modified, clear it so we use _compressedBytes
if (_storedUncompressedData != null && !storedDataModified)
{
_storedUncompressedData.Dispose();
_storedUncompressedData = null;
}

Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clearing _storedUncompressedData and falling back to _compressedBytes when StoredData isn't set can corrupt archives for entries whose metadata was mutated during GetUncompressedData*(). In particular, GetUncompressedData*() changes CompressionMethod to Deflate for non-Stored entries; if an entry was opened in Update mode (no writes) and later the archive is rewritten due to other changes, the header may be written with CompressionMethod=Deflate while the original _compressedBytes are still Deflate64, producing an invalid entry. To avoid mismatches, ensure metadata like CompressionMethod is not changed unless the entry is actually modified (e.g., defer normalization to MarkAsModified/first write), or restore the original method when discarding _storedUncompressedData.

Suggested change
// Check if the entry's stored data was actually modified (StoredData flag is set).
// If _storedUncompressedData is loaded but StoredData is not set, it means the entry
// was opened for update but no writes occurred - we should use the original compressed bytes.
bool storedDataModified = (Changes & ZipArchive.ChangeState.StoredData) != 0;
// If _storedUncompressedData is loaded but not modified, clear it so we use _compressedBytes
if (_storedUncompressedData != null && !storedDataModified)
{
_storedUncompressedData.Dispose();
_storedUncompressedData = null;
}

Copilot uses AI. Check for mistakes.
Comment on lines +354 to +365
// Check if the entry's stored data was actually modified (StoredData flag is set).
// If _storedUncompressedData is loaded but StoredData is not set, it means the entry
// was opened for update but no writes occurred - we should use the original compressed bytes.
bool storedDataModified = (Changes & ZipArchive.ChangeState.StoredData) != 0;

// If _storedUncompressedData is loaded but not modified, clear it so we use _compressedBytes
if (_storedUncompressedData != null && !storedDataModified)
{
await _storedUncompressedData.DisposeAsync().ConfigureAwait(false);
_storedUncompressedData = null;
}

Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same concern as the sync path: discarding _storedUncompressedData when StoredData isn't set can create header/data mismatches because GetUncompressedDataAsync() may have already changed metadata (e.g., CompressionMethod is normalized to Deflate). If the archive is later rewritten for other reasons, the entry may be written with updated metadata but original _compressedBytes. Ensure metadata normalization happens only when the entry is actually modified, or restore original metadata when reverting to _compressedBytes.

Suggested change
// Check if the entry's stored data was actually modified (StoredData flag is set).
// If _storedUncompressedData is loaded but StoredData is not set, it means the entry
// was opened for update but no writes occurred - we should use the original compressed bytes.
bool storedDataModified = (Changes & ZipArchive.ChangeState.StoredData) != 0;
// If _storedUncompressedData is loaded but not modified, clear it so we use _compressedBytes
if (_storedUncompressedData != null && !storedDataModified)
{
await _storedUncompressedData.DisposeAsync().ConfigureAwait(false);
_storedUncompressedData = null;
}

Copilot uses AI. Check for mistakes.
byte[] buffer = new byte[sampleEntryContents.Length + 1]; // +1 for the index byte added by CreateZipFile
int bytesRead = async
? await entryStream.ReadAsync(buffer)
: entryStream.Read(buffer, 0, buffer.Length);
Copy link

Copilot AI Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bytesRead is assigned but never used. If warnings are treated as errors in this test project, this will break the build. Either remove the variable or assert on it (e.g., that some bytes were read) so the read has a checked outcome.

Suggested change
: entryStream.Read(buffer, 0, buffer.Length);
: entryStream.Read(buffer, 0, buffer.Length);
Assert.InRange(bytesRead, 1, buffer.Length);

Copilot uses AI. Check for mistakes.
private byte[] _archiveComment;
private Encoding? _entryNameAndCommentEncoding;
private long _firstDeletedEntryOffset;
private bool _isNewArchive; // true if archive was created on an empty stream
Copy link
Member

@stephentoub stephentoub Jan 28, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary? We can't determine whether the stream is empty by checking its Length when we're going to do the write?


// Archive-level changes (e.g., comment)
if (Changed != ChangeState.Unchanged)
return true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: please consistently use { }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Package.Open ZIP regression or unmitigatable breaking change on .NET 9

2 participants