# NimPak Remote Repository and Binary Cache Specification
## Overview
The NimPak Remote Repository and Binary Cache system distributes packages across remote repositories, binary caches, and mirrors, with cryptographic verification, efficient synchronization, and automatic selection of a compatible binary. It builds on NimPak's security foundation: installs are served from pre-built binaries where possible, and every downloaded object is checked against its signature and hash.
## Architecture
```
┌────────────────────────────────────────────────────────────┐
│                 Remote Repository Network                  │
├────────────────────────────────────────────────────────────┤
│ ┌─────────────────┐  ┌─────────────────┐  ┌──────────────┐ │
│ │ Repository      │  │ Binary Cache    │  │ Mirror       │ │
│ │ Server          │  │ Server          │  │ Network      │ │
│ │                 │  │                 │  │              │ │
│ │ • Metadata      │  │ • Binary Store  │  │ • Sync       │ │
│ │ • Manifests     │  │ • Compatibility │  │ • Replicate  │ │
│ │ • Signatures    │  │ • Auto-select   │  │ • Load Bal   │ │
│ │ • Trust Scores  │  │ • Verification  │  │ • Failover   │ │
│ └─────────────────┘  └─────────────────┘  └──────────────┘ │
├────────────────────────────────────────────────────────────┤
│ ┌─────────────────┐  ┌─────────────────┐  ┌──────────────┐ │
│ │ Remote Manager  │  │ Cache Manager   │  │ Sync Engine  │ │
│ │                 │  │                 │  │              │ │
│ │ • Fetch/Push    │  │ • Hit/Miss      │  │ • Delta Sync │ │
│ │ • Auth/Trust    │  │ • Compatibility │  │ • Bloom Filt │ │
│ │ • Retry Logic   │  │ • Eviction      │  │ • Bandwidth  │ │
│ │ • Load Balance  │  │ • Statistics    │  │ • Integrity  │ │
│ └─────────────────┘  └─────────────────┘  └──────────────┘ │
└────────────────────────────────────────────────────────────┘
```
## Core Components
### 1. Repository Server
The repository server hosts package metadata, manifests, and trust information.
**Features:**
- **Signed Manifests**: All metadata cryptographically signed with Ed25519
- **Trust Propagation**: Cross-repository trust score sharing
- **Delta Uploads**: Efficient CAS-based synchronization
- **Policy Enforcement**: Server-side trust policy validation
- **Audit Trails**: Complete operation logging for compliance
**API Endpoints:**
```
GET  /api/v1/packages                    # List packages
GET  /api/v1/packages/{name}             # Package metadata
GET  /api/v1/packages/{name}/{version}   # Version-specific data
GET  /api/v1/manifests/{hash}            # Package manifest
POST /api/v1/packages                    # Upload package
PUT  /api/v1/trust/{actor}               # Update trust score
GET  /api/v1/sync/changes                # Incremental sync
```
### 2. Binary Cache Server
The binary cache server provides pre-compiled binaries with compatibility matching.
**Features:**
- **Compatibility Detection**: CPU flags, libc, allocator, architecture
- **Automatic Selection**: Best binary match for target system
- **Fallback Logic**: Source build when no compatible binary exists
- **Verification**: Every binary cryptographically verified
- **Statistics**: Cache hit rates and performance metrics
**Cache Key Format:**
```
{package}-{version}-{arch}-{libc}-{allocator}-{cpu_flags}-{build_hash}
```
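As an illustration of the key format above, the following minimal sketch joins the fields in the listed order. The `BinaryVariant` type and the `cacheKey` proc are hypothetical names, and the choice to lowercase, sort, and join CPU flags with `+` is an assumption made here so that the same flag set always produces the same key; the specification does not define how the flag list is serialized.

```nim
import std/[algorithm, sequtils, strutils]

type
  BinaryVariant = object        # hypothetical type, not defined by the spec
    package, version, arch, libc, allocator, buildHash: string
    cpuFlags: seq[string]

proc cacheKey(v: BinaryVariant): string =
  ## Builds {package}-{version}-{arch}-{libc}-{allocator}-{cpu_flags}-{build_hash}.
  ## Assumption: flags are lowercased, sorted, de-duplicated and joined with "+".
  let flags = v.cpuFlags.mapIt(it.toLowerAscii).sorted
                .deduplicate(isSorted = true).join("+")
  [v.package, v.version, v.arch, v.libc, v.allocator, flags, v.buildHash].join("-")
```

Because the flags are normalized before joining, `@["sse4.2", "avx2"]` and `@["AVX2", "sse4.2"]` map to the same cache entry.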
### 3. Mirror Network
The mirror network provides global distribution with intelligent routing.
**Features:**
- **Geographic Distribution**: Mirrors in multiple regions
- **Load Balancing**: Intelligent routing based on latency and load
- **Failover**: Automatic failover to healthy mirrors
- **Synchronization**: Real-time sync with integrity verification
- **Bandwidth Optimization**: Delta sync and compression
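
The load balancing and failover features listed above could be sketched as follows. The `Mirror` fields and the cost formula (latency scaled by load) are assumptions chosen for illustration; the specification does not prescribe a routing metric.

```nim
import std/options

type
  Mirror = object               # hypothetical fields
    url: string
    healthy: bool
    latencyMs: float
    load: float                 # 0.0 (idle) .. 1.0 (saturated)

proc selectMirror(mirrors: openArray[Mirror]): Option[Mirror] =
  ## Returns the healthy mirror with the lowest cost, or none if all are down.
  ## Unhealthy mirrors are skipped, which gives failover for free.
  var bestCost = Inf
  for m in mirrors:
    if not m.healthy: continue
    let cost = m.latencyMs * (1.0 + m.load)   # assumed metric
    if cost < bestCost:
      bestCost = cost
      result = some(m)
```

A caller would re-run `selectMirror` after marking a mirror unhealthy, so that the next request goes to the best remaining one.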
## Protocol Specifications
### 1. Repository Manifest Format
```kdl
repository_manifest {
    version "1.0"
    repository_id "nexusos-stable"
    timestamp "2025-01-15T10:30:00Z"
    signature {
        algorithm "ed25519"
        key_id "nexusos-repo-2025"
        signature "base64-encoded-signature"
    }
    packages {
        htop {
            version "3.2.2"
            hash "blake3-abc123..."
            trust_score 0.95
            binaries {
                x86_64_musl_jemalloc "blake3-binary1..."
                x86_64_glibc_default "blake3-binary2..."
            }
        }
    }
    trust_policies {
        minimum_score 0.7
        require_signatures true
        allowed_sources "original" "grafted"
    }
}
```
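To make the `trust_policies` node above concrete, here is a minimal sketch of how a client might apply it to one package entry. The `TrustPolicy` and `PackageEntry` types are hypothetical, and the `signed` and `source` fields are assumed to come from earlier verification steps that are not shown.

```nim
type
  TrustPolicy = object          # mirrors the `trust_policies` node
    minimumScore: float
    requireSignatures: bool
    allowedSources: seq[string]

  PackageEntry = object         # hypothetical, filled in by earlier checks
    name: string
    trustScore: float
    signed: bool
    source: string              # e.g. "original" or "grafted"

proc satisfies(pkg: PackageEntry, policy: TrustPolicy): bool =
  ## A package passes only if it meets every rule in the policy.
  if pkg.trustScore < policy.minimumScore: return false
  if policy.requireSignatures and not pkg.signed: return false
  pkg.source in policy.allowedSources
```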
### 2. Binary Cache Metadata
```kdl
binary_cache_entry {
    package_id "htop"
    version "3.2.2"
    build_hash "blake3-build456..."
    compatibility {
        architecture "x86_64"
        libc "musl-1.2.4"
        allocator "jemalloc-5.3.0"
        cpu_features "sse4.2" "avx2"
        abi_version "1.0"
    }
    binary {
        hash "blake3-binary789..."
        size 2048576
        compression "zstd"
        signature {
            algorithm "ed25519"
            key_id "build-farm-2025"
            signature "base64-signature"
        }
    }
    build_info {
        builder "nexusos-build-farm-01"
        build_time "2025-01-15T08:00:00Z"
        compiler_version "nim-2.0.0"
        build_flags "--opt:speed" "--cpu:native"
    }
}
```
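The `compatibility` node above is what drives binary selection. The sketch below shows one possible matching rule under stated assumptions: architecture must match exactly, libc is compared by family only (`musl` vs `glibc`), and every CPU feature the binary was built for must exist on the host. The type and proc names are hypothetical, and the allocator and ABI version are left out for brevity.

```nim
import std/[sets, strutils]

type
  Compatibility = object        # hypothetical type mirroring the KDL node
    architecture: string
    libc: string                # e.g. "musl-1.2.4"
    cpuFeatures: HashSet[string]

proc libcFamily(libc: string): string =
  ## "musl-1.2.4" -> "musl". Matching on family only is an assumption.
  libc.split('-')[0]

proc isCompatible(binary, host: Compatibility): bool =
  ## The binary's required CPU features must be a subset of the host's.
  binary.architecture == host.architecture and
    libcFamily(binary.libc) == libcFamily(host.libc) and
    binary.cpuFeatures <= host.cpuFeatures

proc selectBinary(candidates: openArray[Compatibility], host: Compatibility): int =
  ## Index of the compatible candidate that uses the most CPU features,
  ## or -1 when none fits and the caller should fall back to a source build.
  result = -1
  var bestFeatureCount = -1
  for i, c in candidates:
    if isCompatible(c, host) and c.cpuFeatures.len > bestFeatureCount:
      bestFeatureCount = c.cpuFeatures.len
      result = i
```

Preferring the candidate with the most matching features is one way to read "best binary match"; a real implementation might weigh other factors as well.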
### 3. Sync Protocol
```kdl
sync_request {
    client_id "nimpak-client-uuid"
    last_sync "2025-01-15T09:00:00Z"
    bloom_filter "base64-encoded-bloom"
    capabilities {
        delta_sync true
        compression "zstd"
        max_bandwidth "10MB/s"
    }
}
sync_response {
    changes {
        added {
            htop "3.2.3" "blake3-new..."
        }
        updated {
            vim "9.0.3" "blake3-updated..."
        }
        removed {
            old_package "1.0.0"
        }
    }
    delta_objects {
        "blake3-delta1..." {
            base "blake3-base..."
            patch "base64-patch..."
        }
    }
    next_sync_token "sync-token-123"
}
```
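The `bloom_filter` field in the request lets the client summarize which objects it already holds, so the server can skip them. The following is a minimal Bloom filter sketch, not the wire format: the sizes, the use of `std/hashes`, and all names are assumptions made for illustration.

```nim
import std/hashes

type
  BloomFilter = object
    bits: seq[bool]
    hashCount: int

proc initBloomFilter(size, hashCount: int): BloomFilter =
  BloomFilter(bits: newSeq[bool](size), hashCount: hashCount)

proc slot(f: BloomFilter, item: string, round: int): int =
  ## Derives the bit index for one hash round by salting with the round number.
  let h = hash((item, round))
  int(cast[uint](h) mod uint(f.bits.len))

proc add(f: var BloomFilter, item: string) =
  for round in 0 ..< f.hashCount:
    f.bits[f.slot(item, round)] = true

proc mightContain(f: BloomFilter, item: string): bool =
  ## false means "definitely absent"; true means "probably present".
  for round in 0 ..< f.hashCount:
    if not f.bits[f.slot(item, round)]: return false
  true
```

A server would send an object only when `mightContain` returns `false` for its hash. False positives are possible, so the client still needs a way to request objects it turned out to be missing. For a filter exchanged over the network, both sides would need to agree on a fixed hash function; `std/hashes` is used here only to keep the sketch dependency-free.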
## Implementation Plan
### Phase 1: Core Remote Manager
**Files to Create:**
- `src/nimpak/remote/manager.nim` - Core remote repository management
- `src/nimpak/remote/client.nim` - HTTP client with retry logic
- `src/nimpak/remote/auth.nim` - Authentication and authorization
- `src/nimpak/remote/manifest.nim` - Manifest parsing and validation
**Key Functions:**
```nim
proc addRepository*(url: string, keyId: string): RemoteResult[Repository]
proc fetchPackageList*(repo: Repository): RemoteResult[seq[PackageInfo]]
proc downloadPackage*(repo: Repository, packageId: string, version: string): RemoteResult[PackageData]
proc uploadPackage*(repo: Repository, package: PackageData): RemoteResult[UploadResult]
proc verifyRepositorySignature*(manifest: RepositoryManifest): RemoteResult[bool]
```
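The retry logic planned for `client.nim` could take the shape of a small generic wrapper with exponential backoff. This is a sketch under assumptions: `withRetry` is a hypothetical name, and doubling the delay after each failure is one common policy rather than something the specification mandates.

```nim
import std/os

proc withRetry[T](attempts: int, baseDelayMs: int, op: proc(): T): T =
  ## Calls `op` until it succeeds or `attempts` tries have been used.
  ## The delay doubles after every failure; the last error is re-raised.
  var delay = baseDelayMs
  for attempt in 1 .. attempts:
    try:
      return op()
    except CatchableError:
      if attempt == attempts: raise
      sleep(delay)
      delay *= 2
```

A download call would then be wrapped as `withRetry[string](3, 200, proc(): string = fetchSomething())`, where `fetchSomething` stands for any operation that raises on transient failure.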
### Phase 2: Binary Cache System
**Files to Create:**
- `src/nimpak/cache/manager.nim` - Binary cache management
- `src/nimpak/cache/compatibility.nim` - Binary compatibility detection
- `src/nimpak/cache/selection.nim` - Optimal binary selection
- `src/nimpak/cache/statistics.nim` - Cache performance metrics
**Key Functions:**
```nim
proc findCompatibleBinary*(packageId: string, version: string, targetSystem: SystemInfo): CacheResult[BinaryInfo]
proc cacheBinary*(binary: BinaryData, metadata: BinaryMetadata): CacheResult[CacheKey]
proc getCacheStatistics*(): CacheStatistics
proc evictOldBinaries*(policy: EvictionPolicy): CacheResult[int]
```
### Phase 3: Synchronization Engine
**Files to Create:**
- `src/nimpak/sync/engine.nim` - Synchronization engine
- `src/nimpak/sync/delta.nim` - Delta synchronization
- `src/nimpak/sync/bloom.nim` - Bloom filter implementation
- `src/nimpak/sync/bandwidth.nim` - Bandwidth management
**Key Functions:**
```nim
proc syncRepository*(repo: Repository, lastSync: DateTime): SyncResult[SyncSummary]
proc createDeltaSync*(source: CASObject, target: CASObject): SyncResult[DeltaPatch]
proc applyDeltaSync*(base: CASObject, patch: DeltaPatch): SyncResult[CASObject]
proc optimizeBandwidth*(transfers: seq[Transfer], maxBandwidth: int): seq[Transfer]
```
### Phase 4: CLI Integration
**Enhanced Commands:**
```bash
# Repository management
nip repo add <url> [--key-id <id>]
nip repo list
nip repo remove <name>
nip repo sync [--repo <name>]
# Package operations with remote support
nip install <package> [--repo <name>] [--prefer-binary]
nip search <query> [--repo <name>]
nip publish <package> [--repo <name>]
# Cache management
nip cache status
nip cache clean [--max-size <size>]
nip cache stats
# Mirror management
nip mirror add <url>
nip mirror list
nip mirror sync
```
## Security Integration
### 1. Trust Verification
Every remote operation integrates with the trust system:
```nim
proc verifyRemotePackage*(package: RemotePackage): TrustResult =
  # 1. Verify repository signature
  let repoTrust = verifyRepositorySignature(package.repository.manifest)
  # 2. Verify package signature
  let packageTrust = verifyPackageSignature(package.signature)
  # 3. Check trust policy
  let policyResult = evaluatePackageTrust(trustManager, package.provenance)
  # 4. Calculate combined trust score
  return combineTrustResults(repoTrust, packageTrust, policyResult)
```
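The specification does not define how the individual results are combined in step 4. One conservative option is a weakest-link rule, sketched below with a hypothetical `TrustResult` shape: the combined result is verified only if every input is, and the combined score is the lowest input score.

```nim
type
  TrustResult = object          # hypothetical shape, for illustration only
    verified: bool
    score: float

proc combineTrustResults(results: varargs[TrustResult]): TrustResult =
  ## Weakest-link combination (an assumption, not the specified rule).
  if results.len == 0:
    return TrustResult(verified: false, score: 0.0)
  result = TrustResult(verified: true, score: 1.0)
  for r in results:
    result.verified = result.verified and r.verified
    result.score = min(result.score, r.score)
```

With this rule a valid package signature cannot compensate for a failed repository signature, which matches the intent that every remote operation passes through the trust system.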
### 2. Secure Communication
All network communication uses TLS with certificate pinning:
```nim
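proc createSecureClient*(repo: Repository): HttpClient =
  # Verify the server certificate chain on every connection
  let ctx = newContext(verifyMode = CVerifyPeer)
  # Pin repository certificate (pinCertificate is assumed to be a
  # project-defined helper; std/net does not provide it)
  if repo.certificatePin.isSome():
    ctx.pinCertificate(repo.certificatePin.get())
  # std/httpclient takes the SSL context at construction time
  return newHttpClient(sslContext = ctx)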
```
### 3. Integrity Verification
Every downloaded object is verified:
```nim
proc downloadWithVerification*(url: string, expectedHash: string): Future[DownloadResult[seq[byte]]] {.async.} =
  # `await` requires an async proc; httpClient is assumed to be a shared async client
  let data = await httpClient.downloadData(url)
  # Verify hash before the data is used anywhere
  let actualHash = computeHash(data, HashBlake3)
  if actualHash != expectedHash:
    return error("Hash verification failed")
  # Log security event
  logGlobalSecurityEvent(EventPackageVerification, SeverityInfo, "remote-download",
                         fmt"Package downloaded and verified: {url}")
  return success(data)
```
## Performance Optimizations
### 1. Parallel Downloads
```nim
proc downloadPackagesParallel*(packages: seq[PackageRequest]): seq[DownloadResult[seq[byte]]] =
  # Start every download first, then wait for all of them together
  var futures: seq[Future[DownloadResult[seq[byte]]]] = @[]
  for package in packages:
    futures.add(downloadPackageAsync(package))
  return waitFor all(futures)
```
### 2. Compression and Caching
```nim
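proc downloadWithCompression*(url: string): DownloadResult[seq[byte]] =
  var client = newHttpClient()
  defer: client.close()
  # Advertise the encodings the client can decode
  client.headers["Accept-Encoding"] = "zstd, gzip"
  # newHttpClient is synchronous, so no await is needed here
  let response = client.get(url)
  # Content-Encoding is absent when the server sends the body uncompressed
  let encoding: string = response.headers.getOrDefault("Content-Encoding")
  let data = decompressData(response.body, encoding)
  return success(data)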
```
### 3. Bandwidth Management
```nim
proc manageBandwidth*(transfers: var seq[Transfer], maxBandwidth: int) =
  var currentBandwidth = 0
  for transfer in transfers.mitems:
    if currentBandwidth + transfer.estimatedBandwidth <= maxBandwidth:
      transfer.start()
      currentBandwidth += transfer.estimatedBandwidth
    else:
      transfer.queue()
```
## Configuration
### Repository Configuration (`nip-repositories.kdl`)
```kdl
repositories {
    version "1.0"
    nexusos_stable {
        url "https://packages.nexusos.org/stable"
        key_id "nexusos-stable-2025"
        priority 100
        enabled true
        trust_policy {
            minimum_score 0.8
            require_signatures true
            allow_grafted false
        }
        cache {
            enabled true
            max_size "10GB"
            ttl 86400 // 24 hours
        }
    }
    community {
        url "https://community.nexusos.org/packages"
        key_id "nexusos-community-2025"
        priority 50
        enabled true
        trust_policy {
            minimum_score 0.6
            require_signatures false
            allow_grafted true
        }
    }
}
```
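The `priority` and `enabled` fields above suggest an order in which repositories are consulted. The sketch below assumes that a higher `priority` value means the repository is tried first; the specification does not state the direction, and the `RepoConfig` type is hypothetical.

```nim
import std/algorithm

type
  RepoConfig = object           # hypothetical subset of a repository entry
    name: string
    priority: int
    enabled: bool

proc resolutionOrder(repos: openArray[RepoConfig]): seq[RepoConfig] =
  ## Enabled repositories, highest priority first (assumed direction).
  for r in repos:
    if r.enabled: result.add r
  result.sort(proc (a, b: RepoConfig): int = cmp(b.priority, a.priority))
```

With the example configuration, `nexusos_stable` (priority 100) would be consulted before `community` (priority 50).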
### Cache Configuration (`nip-cache.kdl`)
```kdl
cache {
    version "1.0"
    binary_cache {
        enabled true
        location "/var/cache/nimpak/binaries"
        max_size "50GB"
        eviction_policy {
            strategy "lru" // lru, lfu, size
            check_interval 3600 // 1 hour
            min_free_space "5GB"
        }
        compatibility {
            strict_matching false
            allow_fallback true
            prefer_native true
        }
    }
    source_cache {
        enabled true
        location "/var/cache/nimpak/sources"
        max_size "10GB"
        ttl 604800 // 1 week
    }
}
```
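For the `lru` strategy in the configuration above, eviction can be sketched as "remove the least recently used entries until the cache fits under `max_size`". The `CacheEntry` type and `selectEvictions` proc are hypothetical, and sizes are assumed to be already parsed into bytes.

```nim
import std/algorithm

type
  CacheEntry = object           # hypothetical bookkeeping record
    key: string
    size: int64                 # bytes
    lastAccess: int64           # Unix timestamp of the last cache hit

proc selectEvictions(entries: openArray[CacheEntry], maxSize: int64): seq[string] =
  ## Returns the keys to evict, least recently used first,
  ## stopping as soon as the remaining total fits within maxSize.
  var total = 0'i64
  for e in entries: total += e.size
  let byAge = sorted(entries, proc (a, b: CacheEntry): int =
    cmp(a.lastAccess, b.lastAccess))
  for e in byAge:
    if total <= maxSize: break
    result.add e.key
    total -= e.size
```

Returning the keys instead of deleting files keeps the policy separate from the storage layer, so the same selection could back a dry-run mode.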
## Monitoring and Metrics
### Performance Metrics
```nim
type
  RemoteMetrics* = object
    downloadCount*: int64
    downloadBytes*: int64
    downloadTime*: float
    cacheHitRate*: float
    averageLatency*: float
    errorRate*: float
  CacheMetrics* = object
    hitCount*: int64
    missCount*: int64
    evictionCount*: int64
    storageUsed*: int64
    storageLimit*: int64
```
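As a small illustration of how the counters relate to the derived rates, the sketch below computes a hit rate and storage utilization from a trimmed copy of `CacheMetrics`. The helper names are hypothetical, and returning 0.0 for an empty cache is an assumption.

```nim
type
  CacheMetrics = object         # trimmed copy of the type above
    hitCount, missCount: int64
    storageUsed, storageLimit: int64

proc hitRate(m: CacheMetrics): float =
  ## Fraction of lookups served from the cache; 0.0 before any lookup.
  let total = m.hitCount + m.missCount
  if total == 0: 0.0 else: m.hitCount.float / total.float

proc utilization(m: CacheMetrics): float =
  ## Fraction of the storage limit currently in use.
  if m.storageLimit == 0: 0.0 else: m.storageUsed.float / m.storageLimit.float
```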
### Health Checks
```nim
proc checkRepositoryHealth*(repo: Repository): HealthResult =
  # Check connectivity
  let pingResult = pingRepository(repo)
  if not pingResult.success:
    return unhealthy("Repository unreachable")
  # Check certificate validity
  let certResult = verifyCertificate(repo)
  if not certResult.valid:
    return unhealthy("Invalid certificate")
  # Check trust status
  let trustResult = verifyRepositoryTrust(repo)
  if trustResult.score < 0.5:
    return warning("Low trust score")
  return healthy("Repository operational")
```
## Future Enhancements
### 1. Content Delivery Network (CDN)
- **Global Distribution**: CDN integration for worldwide package distribution
- **Edge Caching**: Cache popular packages at edge locations
- **Intelligent Routing**: Route requests to nearest healthy edge
### 2. Peer-to-Peer Distribution
- **P2P Protocol**: BitTorrent-like protocol for package distribution
- **Swarm Intelligence**: Coordinate downloads across multiple peers
- **Bandwidth Sharing**: Share bandwidth costs across community
### 3. Advanced Caching
- **Predictive Caching**: ML-based prediction of package needs
- **Collaborative Filtering**: Share cache decisions across similar systems
- **Adaptive Policies**: Dynamic cache policies based on usage patterns
---
This specification provides the foundation for a distributed package distribution system that builds on NimPak's security foundation to deliver fast package installations in which every object is verified by signature and hash.