Project Category: Individual Project (Personal Productivity Tool)
A high-performance file discovery tool engineered for rapid filesystem analysis on Windows. It uses .NET Framework APIs and a bounded sorted set to identify the largest files across entire drives, with full-drive scans running roughly 12-50x faster than standard PowerShell cmdlets in the benchmarks below.
Tested on: 16-core CPU, 32GB RAM, 1.82TB HDD
Scan Target | Files Discovered | Execution Time | Throughput |
---|---|---|---|
Complete C:\ drive (media filter) | 469,385 | 151 seconds | 3,107 files/sec |
User Downloads folder (video filter) | 76 | 2.1 seconds | 36 files/sec |
User Music folder (audio filter) | 621 | 0.07 seconds | 8,821 files/sec |
Method | Time to Scan C:\ | Throughput | Performance |
---|---|---|---|
File Scanner (.NET APIs) | 151 seconds | 3,107 files/sec | Baseline (1x) |
PowerShell Get-ChildItem | 2,500+ seconds | ~188 files/sec | 16-50x slower |
Get-ChildItem with filters | 1,800+ seconds | ~260 files/sec | 12x slower |
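The ratios in the table follow from the timings: 2,500 s / 151 s is about 16.6x, and 1,800 s / 151 s is about 11.9x. To repeat the comparison on another machine, a minimal timing sketch along the following lines could be used. The helper name `Measure-ScanThroughput` is hypothetical and not part of the tool, and the temp folder is only a placeholder target.

```powershell
# Hypothetical helper (not part of file_scanner.ps1): runs a scan script block,
# counts the objects it emits, and reports files per second.
function Measure-ScanThroughput {
    param(
        [Parameter(Mandatory)] [scriptblock] $Scan   # should emit one object per file
    )
    $stopwatch = [System.Diagnostics.Stopwatch]::StartNew()
    $count = @(& $Scan).Count
    $stopwatch.Stop()
    $seconds = $stopwatch.Elapsed.TotalSeconds
    [pscustomobject]@{
        Files       = $count
        Seconds     = [math]::Round($seconds, 2)
        FilesPerSec = if ($seconds -gt 0) { [math]::Round($count / $seconds) } else { 0 }
    }
}

# Assumption: the temp folder stands in for the folder you actually want to measure.
$target = [System.IO.Path]::GetTempPath()
$baseline = Measure-ScanThroughput {
    Get-ChildItem -Path $target -Recurse -File -ErrorAction SilentlyContinue
}
$baseline
```

Timing the scanner with the same helper on the same folder gives a like-for-like throughput figure. Filesystem caching can skew a second run, so alternating the order of the two measurements is advisable.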
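Uses .NET's SortedSet<T> as a min-heap to maintain the top K largest files. SortedSet<T> is a balanced binary search tree rather than a true heap, but it offers the same O(log K) insert and remove-minimum operations.
Time complexity: O(M log K) where M = total files, K = top files to track
Space complexity: O(K), independent of the total file count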
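$minHeap = [System.Collections.Generic.SortedSet[object]]::new(
    [System.Collections.Generic.Comparer[object]]::Create({
        param($a, $b)
        # Ascending by size; ties are broken on the full path so that two files
        # of equal size are not treated as duplicates and silently dropped
        $bySize = $a.Length.CompareTo($b.Length)
        if ($bySize -ne 0) { return $bySize }
        [string]::CompareOrdinal($a.FullName, $b.FullName)
    })
)
# Insert if heap not full or file larger than minimum
# ($fileInfo is assumed to be a System.IO.FileInfo)
if ($minHeap.Count -lt $TopCount) {
    [void]$minHeap.Add($fileInfo)   # [void] discards the Boolean returned by Add()
} elseif ($fileInfo.Length -gt $minHeap.Min.Length) {
    [void]$minHeap.Remove($minHeap.Min)
    [void]$minHeap.Add($fileInfo)
}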
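The bounded top-K pattern can be tried in isolation with a few in-memory records. This is a minimal sketch, not the tool's actual code: the sample objects are hypothetical stand-ins for FileInfo, and the comparer breaks ties on `FullName` (an assumption) so that files of equal size stay distinct in the set.

```powershell
# Hypothetical stand-ins for FileInfo objects: only FullName and Length are used.
$files = @(
    [pscustomobject]@{ FullName = 'a.mp4'; Length = [long]500 }
    [pscustomobject]@{ FullName = 'b.mkv'; Length = [long]1200 }
    [pscustomobject]@{ FullName = 'c.avi'; Length = [long]300 }
    [pscustomobject]@{ FullName = 'd.mp4'; Length = [long]1200 }   # same size as b.mkv
    [pscustomobject]@{ FullName = 'e.mov'; Length = [long]50 }
)
$TopCount = 3

# Order by size, then by path, so equal-size files are not seen as duplicates.
$comparison = [System.Comparison[object]] {
    param($a, $b)
    $bySize = $a.Length.CompareTo($b.Length)
    if ($bySize -ne 0) { return $bySize }
    [string]::CompareOrdinal($a.FullName, $b.FullName)
}
$minHeap = [System.Collections.Generic.SortedSet[object]]::new(
    [System.Collections.Generic.Comparer[object]]::Create($comparison)
)

foreach ($fileInfo in $files) {
    if ($minHeap.Count -lt $TopCount) {
        [void]$minHeap.Add($fileInfo)
    } elseif ($fileInfo.Length -gt $minHeap.Min.Length) {
        # Evict the smallest tracked file to make room for a larger one.
        [void]$minHeap.Remove($minHeap.Min)
        [void]$minHeap.Add($fileInfo)
    }
}

# SortedSet enumerates ascending; Reverse() yields largest first.
$largest = @($minHeap.Reverse())
$largest | ForEach-Object { '{0,-6} {1}' -f $_.FullName, $_.Length }
```

The set never holds more than `$TopCount` entries, which is where the O(K) space bound comes from. Here the three survivors are d.mp4, b.mkv and a.mp4, in that order.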
Directory traversal uses a FIFO queue for breadth-first search (BFS) instead of recursive depth-first search (DFS).
Advantages:
• Prevents stack overflow on deeply nested trees (Windows can have 10,000+ nested directories)
• Better cache locality (processes all files in a directory before moving on to the next)
• Predictable memory usage: O(D) in the worst case, where D = directory count
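$queue = [System.Collections.Generic.Queue[string]]::new()
$queue.Enqueue($RootPath)
while ($queue.Count -gt 0) {
    $currentPath = $queue.Dequeue()
    try {
        # Queue subdirectories so they are visited after the current level
        foreach ($dir in [System.IO.Directory]::EnumerateDirectories($currentPath)) {
            $queue.Enqueue($dir)
        }
        foreach ($file in [System.IO.Directory]::EnumerateFiles($currentPath)) {
            # Process file
        }
    } catch [System.UnauthorizedAccessException], [System.IO.IOException] {
        # Skip directories that cannot be read (e.g. protected system folders)
    }
}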
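To see the visiting order this loop produces, the traversal can be wrapped in a function and run against a small throwaway tree. The function name `Get-FilesBreadthFirst` and the demo folder layout are hypothetical, and skipping unreadable directories is an assumption about how the tool should behave.

```powershell
# Hypothetical helper: walks a tree breadth-first and emits one FileInfo per file.
function Get-FilesBreadthFirst {
    param([Parameter(Mandatory)] [string] $RootPath)

    $queue = [System.Collections.Generic.Queue[string]]::new()
    $queue.Enqueue($RootPath)
    while ($queue.Count -gt 0) {
        $currentPath = $queue.Dequeue()
        try {
            foreach ($dir in [System.IO.Directory]::EnumerateDirectories($currentPath)) {
                $queue.Enqueue($dir)
            }
            foreach ($file in [System.IO.Directory]::EnumerateFiles($currentPath)) {
                [System.IO.FileInfo]::new($file)   # emitted to the pipeline
            }
        } catch [System.UnauthorizedAccessException], [System.IO.IOException] {
            # Assumption: unreadable directories are skipped rather than aborting the scan.
            Write-Verbose "Skipping '$currentPath': $($_.Exception.Message)"
        }
    }
}

# Demo on a throwaway tree under the temp folder: one file per nesting level.
$root = Join-Path ([System.IO.Path]::GetTempPath()) ("bfs-demo-" + [guid]::NewGuid())
$null = New-Item -ItemType Directory -Path (Join-Path $root 'a/b') -Force
Set-Content -Path (Join-Path $root 'top.txt') -Value 'x'
Set-Content -Path (Join-Path $root 'a/mid.txt') -Value 'xx'
Set-Content -Path (Join-Path $root 'a/b/deep.txt') -Value 'xxx'

$found = @(Get-FilesBreadthFirst -RootPath $root)
$found | ForEach-Object { $_.Name }

Remove-Item -Path $root -Recurse -Force
```

Files come back shallowest level first (top.txt, mid.txt, deep.txt), which is the breadth-first property. Each emitted FileInfo can be fed straight into the top-K insert logic.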
# Scan C:\ drive for top 73 largest media files
.\file_scanner.ps1 -RootPath "C:\" -TopCount 73 -FileTypes @(".mp4",".mkv",".avi")
# Scan Downloads folder for videos
.\file_scanner.ps1 -RootPath "$env:USERPROFILE\Downloads" -FileTypes @(".mp4",".mov")
# Quick scan of Pictures folder
.\file_scanner.ps1 -RootPath "$env:USERPROFILE\Pictures" -TopCount 20
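One way the `-FileTypes` filter in these examples might be applied is a case-insensitive set lookup on each file's extension. This is a sketch under assumptions, not the script's confirmed behavior: the helper name `Test-FileTypeMatch` is hypothetical, and treating an omitted filter as "match every file" is a guess based on the third example.

```powershell
# Assumption: -FileTypes holds extensions with a leading dot, as in the usage examples.
$FileTypes = @('.mp4', '.mkv', '.avi')

# Case-insensitive set so ".MP4" and ".mp4" both match.
$allowed = [System.Collections.Generic.HashSet[string]]::new(
    [string[]]$FileTypes,
    [System.StringComparer]::OrdinalIgnoreCase
)

# Hypothetical helper: true when the path's extension is in the allowed set,
# or when no filter was supplied at all.
function Test-FileTypeMatch {
    param(
        [string] $Path,
        [System.Collections.Generic.HashSet[string]] $Allowed
    )
    if ($null -eq $Allowed -or $Allowed.Count -eq 0) { return $true }
    $Allowed.Contains([System.IO.Path]::GetExtension($Path))
}

$isVideo = Test-FileTypeMatch -Path 'C:\Videos\Holiday.MP4' -Allowed $allowed
$isText  = Test-FileTypeMatch -Path 'C:\Docs\notes.txt' -Allowed $allowed
"Holiday.MP4 matches: $isVideo"
"notes.txt matches:   $isText"
```

A hash set keeps the per-file check at O(1) however many extensions are listed, which matters when the check runs once per file over hundreds of thousands of files.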