Sorting mailbox attachments

Last Updated or created 2026-06-02

I often forget to download important attachments, like orders and important paperwork.

I am using paperless-ngx to make PDFs searchable.

  • export (Takeout) mail from, for example, Gmail
  • extract all attachments using the command below
  • sort into media types
  • use a dangerzone.rocks installation to sanitize (remove malware/viruses)

formail -des munpack < "All mail Including Spam and Trash-002.mbox"
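The cleanup loops further down use broad globs on the current directory, so it may be safer to unpack into a scratch directory first. A minimal sketch, assuming formail and munpack are installed; the function name `extract_attachments` and the directory name `attachments` are made up for illustration and are not part of the original setup:

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): unpack all attachments
# of an mbox file into a dedicated directory.
extract_attachments() {
    mbox=$1      # path to the exported .mbox file
    outdir=$2    # scratch directory that will receive the attachments

    [ -f "$mbox" ] || { echo "mbox not found: $mbox" >&2; return 1; }
    command -v formail >/dev/null 2>&1 || { echo "formail is not installed" >&2; return 1; }
    command -v munpack >/dev/null 2>&1 || { echo "munpack is not installed" >&2; return 1; }
    mkdir -p "$outdir" || return 1

    # The redirection is opened before the subshell changes directory,
    # so a relative mbox path still resolves correctly.
    ( cd "$outdir" && formail -des munpack ) < "$mbox"
}

# Example call (file and directory names are placeholders):
# extract_attachments "All mail Including Spam and Trash-002.mbox" attachments
```

After this, `cd attachments` and run the cleanup script from there.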

Cleanup script

find . -type f -iname '*.desc' -exec rm {} +


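# Names that come out wrapped as =X...X= : strip the wrapper, keep a trailing .N counter
for f in =X*; do
    [ -e "$f" ] || continue   # glob matched nothing
    new=$(printf '%s\n' "$f" | sed -E 's/^=X//; s/X=(\.[0-9]+)?$/\1/')
    mv -- "$f" "$new"
done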

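# Remove numbered copies (name.1, name.2 ...) that are byte-identical to the original
for f in *.[0-9]*; do
    case ${f##*.} in ''|*[!0-9]*) continue ;; esac   # only purely numeric suffixes
    base=${f%.*}

    if [ -f "$base" ] && cmp -s -- "$base" "$f"; then
        echo "Removing duplicate: $f"
        rm -- "$f"
    fi
done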

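# Move the counter in front of the extension: winmail.dat.1 -> winmail-1.dat
for f in *.[0-9]*; do
    [ -f "$f" ] || continue
    n=${f##*.}          # 1, 2, 3 ...
    case $n in ''|*[!0-9]*) continue ;; esac   # skip names such as report.2020.pdf
    base=${f%.*}        # winmail.dat

    case $base in
        *.*) mv -- "$f" "${base%.*}-${n}.${base##*.}" ;;   # winmail-1.dat
        *)   mv -- "$f" "${base}-${n}" ;;                   # no extension: README-1
    esac
done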

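# Strip all leading dashes so the names are not mistaken for command options
for f in -*; do
    [ -e "$f" ] || continue
    new=$f
    while [ "${new#-}" != "$new" ]; do new=${new#-}; done
    [ -n "$new" ] && mv -- "$f" "$new"
done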

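# Strip a stray trailing X, but leave real extensions such as .DOCX alone
for f in *X; do
    [ -e "$f" ] || continue
    case $f in X|*.[Dd][Oo][Cc]X|*.[Xx][Ll][Ss]X|*.[Pp][Pp][Tt]X) continue ;; esac
    mv -- "$f" "${f%X}"
done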


mkdir -p pdf images audio text movies bww zip midi html vcf xml

# Sort regular files by their lowercased extension. Matching on "*.*" and
# testing with -f keeps the target directories (pdf, zip, html ...) out of the loop.
for f in *.*; do
    [ -f "$f" ] || continue
    ext=$(printf '%s' "${f##*.}" | tr '[:upper:]' '[:lower:]')
    case $ext in
        pdf)                                  dir=pdf ;;
        gif|jpg|jpeg|bmp|png|tif|eps|svg|psd) dir=images ;;
        mp3|wma|wav|m4a)                      dir=audio ;;
        txt|wri|doc|docx|xls|xlsx|ppt|pptx)   dir=text ;;
        mp4|avi|mov|mpg)                      dir=movies ;;
        bww|abc|pio)                          dir=bww ;;
        zip|tgz|tar|rar)                      dir=zip ;;
        mid)                                  dir=midi ;;
        html|htm)                             dir=html ;;
        vcf)                                  dir=vcf ;;
        xml)                                  dir=xml ;;
        *)                                    continue ;;   # unknown type: leave in place
    esac
    mv -- "$f" "$dir/"
done
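
After sorting, some files will probably still sit in the top-level directory because their type is not in the list. A small sketch to see which extensions are left over and how often; the function name `leftover_extensions` is hypothetical, and `-maxdepth` assumes GNU or BSD find:

```shell
#!/bin/sh
# Hypothetical helper: count the extensions of files still lying in a directory
# (top level only), case-insensitively, most frequent first.
leftover_extensions() {
    dir=${1:-.}
    find "$dir" -maxdepth 1 -type f -name '*.*' |
        sed -E 's/.*\.//' |               # keep only the text after the last dot
        tr '[:upper:]' '[:lower:]' |      # treat PDF and pdf as the same type
        sort | uniq -c | sort -rn
}

# Example call for the current directory:
# leftover_extensions .
```

Each output line is a count followed by an extension, which makes it easy to decide whether another target directory is worth adding.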

Next thing to do: sanitize the PDFs.

Then let paperless-ngx ingest them.
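
paperless-ngx picks up documents that are placed in its consumption directory, so the ingest step could look roughly like the sketch below. The function name `queue_for_paperless` and both paths in the example call are assumptions; use whatever directory your installation is configured to watch. Files are copied rather than moved so the sorted originals stay where they are:

```shell
#!/bin/sh
# Hypothetical helper: copy every PDF from a source directory into the
# directory that paperless-ngx watches for new documents.
queue_for_paperless() {
    src=$1       # directory holding the (already sanitized) PDFs
    consume=$2   # paperless-ngx consumption directory (path is an assumption)

    [ -d "$consume" ] || { echo "consume directory not found: $consume" >&2; return 1; }

    for f in "$src"/*; do
        case $f in *.[Pp][Dd][Ff]) ;; *) continue ;; esac   # PDFs only, any case
        [ -f "$f" ] || continue
        cp -- "$f" "$consume"/ || return 1
    done
    return 0
}

# Example call (both paths are placeholders):
# queue_for_paperless pdf /path/to/paperless/consume
```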
