It's A Digital Disease!


This is a sub that aims to bring data hoarders together to share their passion with like-minded people.

7876
 
 
The original post: /r/datahoarder by /u/Advance_Shot on 2024-07-28 14:55:37.

Can I do it? After I log in, I tried Video DownloadHelper and it doesn't work: it shows the video in the list, but the download doesn't start.

7877
 
 
The original post: /r/datahoarder by /u/reaperofpower on 2024-07-28 18:00:44.

Does anyone record the live streams of the sets at Tomorrowland?

7878
 
 
The original post: /r/datahoarder by /u/geekman20 on 2024-07-28 17:25:44.
7879
 
 
The original post: /r/datahoarder by /u/AlexFireFox on 2024-07-28 17:02:54.

There are so many different methods of storing data, but I just don't know which to use. I was thinking of having an iCloud backup and a physical backup, but I don't know what to do for the physical one. What is the most reliable way to store less than 2 TB of data without breaking the bank? TIA

Edit: I will be storing photos and videos for memories and preservation. Not sure if that affects the suggestions but that's the use.

7880
 
 
The original post: /r/datahoarder by /u/scardemon on 2024-07-28 15:45:45.

I mainly use it for torrenting, but I've heard this product is prone to early failure. It gets used heavily for seeding and downloading a lot of files. I was wondering what the best way to use it is, or what replacement to use, as a backup for at least 3 to 5 years. I understand that drives don't last forever, but what is the best practice for long-term storage as well as heavy torrenting use?

7881
 
 
The original post: /r/datahoarder by /u/Youngqueazy on 2024-07-28 15:08:54.

Hey everyone, I’m going to be moving my NAS to a new address in the next few weeks but I’m having a difficult time figuring out the best solution to keep my drives protected.

I’ve seen things like the individual Orico cases and pelican cases.

I have 8 drives. What would you guys recommend?

7882
 
 
The original post: /r/datahoarder by /u/scphantm on 2024-07-28 14:39:37.

I have a large 45-drive storage array managed by MS Storage Spaces on Windows 10, with drives ranging from 1 TB all the way to 20 TB. They are configured as a full mirror. I would say I'm satisfied with the setup so far, but it's time to upgrade the OS. My choices are Windows Server or Linux. I MUCH prefer Linux, but I am having a hard time with the drive array. I need some kind of management system that lets me reassemble my drives of varying make, model, and size into a single fault-tolerant virtual drive. It has to handle drive failures and recovery easily, as some of my drives are many, many years beyond their prime. This has been Storage Spaces' biggest advantage.

Oh, and no striping. The array is about 160 TB available, 320 TB in raw space. I already know I have to transfer the files to a new array whatever I do, since I want to get off NTFS, so now is the time if I can find a replacement for Storage Spaces.

Any suggestions?

7883
 
 
The original post: /r/datahoarder by /u/Ancient_Purchase4816 on 2024-07-28 14:33:39.

Hi all,

I bought some refurbished WD disks.

In the recording there are three disks in a RAIDZ1 configuration and I am copying data over the network at 85 MB/s.

https://sndup.net/chz3b/

The reason I am concerned is because these are my first high capacity disks and NAS, and since they are refurbished I would like to know if my data and investment are safe.

Are these sounds ok?

7884
 
 
The original post: /r/datahoarder by /u/Squid1917 on 2024-07-28 14:17:31.

I'm planning on getting a Jonsbo N3 NAS case, with 8× 20 TB Exos drives and an RTX 4060. I need help finding a motherboard; since the case is mini-ITX, that limits the options a lot. My current idea is a B550 Phantom Gaming-ITX/ax, as it has 4 SATA ports and 2 M.2 slots (I'm going to use an M.2-to-SATA adapter). Is this what you professionals would recommend, or something different? I don't want to break the bank: max of 250.

Squid

7885
 
 
The original post: /r/datahoarder by /u/AyneHancer on 2024-07-28 14:16:55.

MHTML is not ideal for pixel-perfect reproduction of original webpages, so I wonder: is there any other file format that can do it?

7886
 
 
The original post: /r/datahoarder by /u/restlessmonkey on 2024-07-28 12:27:19.

I've never heard of them. Hoping others have. Is it worth getting alongside a NAS?

https://www.stacksocial.com/sales/folderfort-1tb-storage-pro-plan-lifetime-subscription

7887
 
 
The original post: /r/datahoarder by /u/Dron22 on 2024-07-28 07:12:50.

This has me worried because I have a Samsung external SSD and a couple of cheaper SSDs that I occasionally left disconnected in a drawer for 6 months or more.

I also have a laptop from 2018 that I don't use for months at a time; its battery would deplete in a month. It has its OS on a 256 GB M.2 SSD, and its D: drive is an SSHD. I don't think I've noticed any obvious problems with it.

I also have multiple regular USB flash drives, some of which are over 10 years old and rarely used. Could they lose data too or become corrupted?
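
Flash cells can lose charge when left unpowered for long stretches, so one practical habit is to keep a checksum manifest and re-verify the drives whenever they come out of the drawer. A minimal sketch in Python (the drive and manifest paths are placeholders):

    # verify_manifest.py - build or re-check a SHA-256 manifest for a drive.
    # Usage: python verify_manifest.py build /path/to/drive manifest.txt
    #        python verify_manifest.py check /path/to/drive manifest.txt
    import hashlib
    import os
    import sys

    def sha256_of(path, chunk=1 << 20):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            while block := f.read(chunk):
                h.update(block)
        return h.hexdigest()

    def files_under(root):
        for dirpath, _dirs, names in os.walk(root):
            for name in names:
                yield os.path.join(dirpath, name)

    def build(root, manifest):
        with open(manifest, "w", encoding="utf-8") as out:
            for path in files_under(root):
                out.write(f"{sha256_of(path)}  {os.path.relpath(path, root)}\n")

    def check(root, manifest):
        problems = 0
        with open(manifest, encoding="utf-8") as f:
            for line in f:
                digest, rel = line.rstrip("\n").split("  ", 1)
                path = os.path.join(root, rel)
                if not os.path.exists(path):
                    print("MISSING ", rel)
                    problems += 1
                elif sha256_of(path) != digest:
                    print("CHANGED ", rel)
                    problems += 1
        print(problems, "problem(s) found")

    if __name__ == "__main__":
        mode, root, manifest = sys.argv[1:4]
        build(root, manifest) if mode == "build" else check(root, manifest)

The manifest layout is the usual "hash, two spaces, relative path" one, so standard checksum tools can read it as well.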

7888
 
 
The original post: /r/datahoarder by /u/ByronWillis on 2024-07-28 03:44:32.

My order for some courtroom trial audio was just fulfilled on FTR (fortherecord.com). They don't provide a way to download it. It appears they want you to log in and use their player every time you want to hear it.

I have downloaded audio delivered electronically like this in the past using some Chrome extensions (Audio Downloader Prime in particular). However, it is not detecting the stream at all. I just installed a few other extensions both on Firefox and Chrome and tried them, but nothing is working. "Bulk Media Downloader" at least sees the numerous tiny files. Stacher says it doesn't support the domain.

When I inspect it and look at the network tab, I see that the audio comes over as one-second MPEG-DASH segments (extension .m4s).

The only other way I know to do it is to just play the audio and record it through OBS / VoiceMeeter as it's being played. But this is a two week trial with 8 hours of audio a day, so I'm looking at having to run the computer for 100+ hours to do so. Possible, but very unattractive option.
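
For what it's worth, MPEG-DASH audio is typically just an initialization segment followed by the .m4s media segments, and concatenating them in order usually yields a playable file. A rough sketch, assuming the segment URLs seen in the network tab follow a simple numbered pattern (the URLs, names, and segment count here are hypothetical, not FTR's real ones):

    # Reassemble DASH audio from an init segment plus numbered .m4s parts.
    # All URLs and the segment count are hypothetical placeholders.
    import requests

    BASE = "https://example.com/stream"           # placeholder
    INIT_URL = BASE + "/init.mp4"                 # initialization segment
    SEGMENT_URL = BASE + "/segment_{:05d}.m4s"    # media segments
    NUM_SEGMENTS = 28800                          # e.g. 8 hours of 1-second segments

    session = requests.Session()                  # add the site's login cookies/headers here

    with open("trial_audio.m4a", "wb") as out:
        out.write(session.get(INIT_URL).content)
        for i in range(1, NUM_SEGMENTS + 1):
            resp = session.get(SEGMENT_URL.format(i))
            if resp.status_code == 404:           # ran past the last segment
                break
            resp.raise_for_status()
            out.write(resp.content)

If the player exposes an .mpd manifest URL instead, ffmpeg or yt-dlp pointed at that manifest (with your session cookies) can often do the reassembly for you.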

Any ideas?

7889
 
 
The original post: /r/datahoarder by /u/alledian1326 on 2024-07-28 02:36:17.

I'm reading this book on the Internet Archive, which you can borrow for up to 1 hour and continuously renew. I would like a PDF copy so I don't have to keep checking it out. There's no PDF download or other download option on the page: https://archive.org/details/interstellarmigr0000unse

Any help is appreciated!

7890
 
 
The original post: /r/datahoarder by /u/hell77 on 2024-07-28 00:54:02.

I started two days ago, transferring all the Twitter links I follow to it, and everything was fine at the beginning.

After that I organized the photos I wanted and moved them into the folders I wanted them in.

The next day, without changing any settings beforehand, I clicked to update each batch and so on, and some of them just started redownloading the same photos, so I tried to mess with the (in my opinion confusing) settings.

That kinda seemed to fix it (on some links).

I clicked update again and it was updating well, but then I noticed it was not finding/downloading everything that was new, only some items. They weren't failures; they just weren't getting picked up.

Any help on how to make the update actually update each time I click it, so it downloads what I missed since the last "update" batch?

PS: Not even the support could help me; the only answers I got were "you probably changed default settings", "you were IP blocked", or "everything is good on my side". It kinda sucks when the person who wrote the program can't help.

7891
 
 
The original post: /r/datahoarder by /u/jeananonymous on 2024-07-27 23:52:49.

Hi everyone,

I use gallery-dl to download some photos.

My output filename format is:

"filename": "{date:%Y.%m.%d}.{filename}.{extension}",

But here {filename} is a very long string of characters and doesn't correspond to the URL of the publication. Is there a way to get that?

7892
 
 
The original post: /r/datahoarder by /u/Redbassover on 2024-07-27 21:14:16.

I have two folders with overlapping and differing files, with duplicates and a different folder structure, and I want to compare them, move the missing data to the internal one, and then mirror that folder to the external one. So my question is: what program do I use for this process with as little reading and writing as possible? I understand that FreeFileSync cannot ignore folder structure. I also want to avoid Robocopy for the comparison, but I could use it for the mirroring.
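
If no off-the-shelf tool fits, the structure-agnostic comparison itself only needs one read pass: hash every file in both trees and list what exists in the external folder but not the internal one. A minimal sketch, with placeholder paths:

    # Compare two trees by content (ignoring folder structure) and report
    # files present in the external tree but missing from the internal one.
    import hashlib
    from pathlib import Path

    INTERNAL = Path("/data/internal")   # placeholder paths
    EXTERNAL = Path("/mnt/external")

    def sha256_of(path: Path) -> str:
        h = hashlib.sha256()
        with path.open("rb") as f:
            for block in iter(lambda: f.read(1 << 20), b""):
                h.update(block)
        return h.hexdigest()

    def content_index(root: Path) -> dict[str, Path]:
        # duplicates collapse onto a single entry, which is fine for this purpose
        return {sha256_of(p): p for p in root.rglob("*") if p.is_file()}

    internal = content_index(INTERNAL)
    for digest, path in content_index(EXTERNAL).items():
        if digest not in internal:
            print("missing from internal:", path)

Everything is read exactly once; copying the reported files and then letting Robocopy (or FreeFileSync) do the final mirror keeps the extra writes to a minimum.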

7893
 
 
The original post: /r/datahoarder by /u/gdvhgdb on 2024-07-27 20:31:57.

Kemono is just too slow for me, and I follow several obscure authors who don't get updated there anymore. I've tried PatreonDownloader, but the problem is that it only saves images, not the actual posts/text.

Oh yeah, I'm an avid reader, by the way. That's why, unlike most people who look for images, I want the written stuff; but when I ran an author's URL through the downloader, apparently there was nothing to save.

I definitely thought that storing text would be easier than images, but I guess I was wrong. Is there a GitHub project or extension I can use to solve this problem of mine?

7894
 
 
The original post: /r/datahoarder by /u/Mattdaxter41 on 2024-07-27 20:23:56.

Hi,

I've been searching a bit for answers, and sometimes get a bit confused by them.

I'd like to have an external hard drive accessible over the local network, from which I could play any media (I don't need to convert it; just like a regular HDD, I'll simply open it with VLC and watch it).

Everything seems to point to a NAS, but with a lot of "overkill". People say that some models (DS423+) can convert files on the fly so you can access them from your phone or other devices, and I don't really need that.

I'm looking for the best solution to just put media on it (10 TB would suffice for now) and access it: over the local network if it needs its own power, or a plug-and-play solution if it's powered by USB.

I don't need a solution with an integrated backup plan, as I will move it elsewhere.

For more details:

I have 205 GB of footage from game A.

I have 726 GB of footage from game B (with roughly 120 GB per raid and 20 GB for PvP edits).

Is an external optical drive worth it?

It seems not, because from a really quick search, discs don't seem to hold more than about 9 GB.

Are USB sticks (or external hard drives) good enough?

I could have a collection of USB sticks/external HDDs with Post-its on them like "Game A", "Game B - Raids" and "Game B - PvP".

Or split even further, like "Game B - Raid A", "Game B - Raid B".

What would you use in my case, to have all your data easily accessible without it living on your computer?

7895
 
 
The original post: /r/datahoarder by /u/Great_Hat_223 on 2024-07-27 19:50:21.

I've tried to search for explanations on the internet and on this subreddit, but the information was too incomplete for me to understand it properly. Therefore, I need some help from those of you who burn data to BD-Rs for archival purposes.

Here's my setup: I have data (photos, videos, and files from my study period) which I no longer need frequently and want to back up for the archive and keep in case it's needed. I really don't want to lose this data. Also, I don't want to maintain it by transferring it to a new medium within a human lifetime.

The best choice for me then is to burn M-Discs. I'm not going to question the validity of the manufacturers' claims here, and I'll assume the BD-R M-Discs are capable of storing data for more than one human lifetime without corruption (e.g. 100+ years). I don't want to use the 50 or 100 GB versions, since they're multi-layered and have not been tested for longevity the way M-Disc DVDs have. Because of the shared single-layer design, I hope 25 GB BD-R M-Discs will behave similarly to their DVD counterparts.

I already have the discs and a drive capable of burning M-Discs, but when researching how to burn them, I found numerous ways to achieve the same goal.

My goal: I plan to have two copies of the data on M-Discs (one off-site) and an additional copy of all the data stored on an HDD. If I split the data across multiple discs, the discs should stay independent from each other (i.e. data corruption on one disc mustn't affect the data on the others).

Now I'm going to list the burning settings and formatting options that I considered relevant and need your help in deciding the best option considering my goal. I use both Windows and Linux systems and cross-compatibility would be a must, at least to read the files.

Burning software and filesystems (ISO 9660, UDF)

This part is generally the easiest. For Windows there is ImgBurn and for Linux there is K3b, and I plan to use those in my case. ImgBurn is no longer maintained though; please let me know if there is any burning software for Windows which is still maintained, preferably open source.

Since I want Windows-Linux reading cross-compatibility, choosing the Linux/Unix + Windows filesystem option in K3b when burning should do the trick. For ImgBurn the default option is ISO 9660 + UDF. Are these options fine? Should I use UDF?

What format of data to burn to a disc: disc image .iso, .zip archives or just simple data disc

Here I'm mostly confused about how to evaluate each option. For example, from my studies I have ca. 160 GB of data and want to archive it onto BD-R M-Discs of 25 GB each.

Data disc

If I go with a simple data disc, I'd have to manually split the data so that it fits on seven discs. That way, even if some bits get corrupted, it won't affect all the files on the disc. The only difficulty is that one has to prearrange the files to fit onto the discs. I've seen this feature offered in CDBurnerXP for Windows.

Do you know some scripts that would be useful in arranging data into parts of certain size for burning?
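
On the script question: a simple first-fit-decreasing pass is usually enough to group files into 25 GB buckets before burning. A rough sketch, where the source path and per-disc capacity are placeholders:

    # Plan how to spread files across ~25 GB discs using first-fit decreasing.
    from pathlib import Path

    SOURCE = Path("/data/studies")        # placeholder
    DISC_CAPACITY = 23 * 10**9            # leave headroom below the nominal 25 GB

    files = sorted(
        (p for p in SOURCE.rglob("*") if p.is_file()),
        key=lambda p: p.stat().st_size,
        reverse=True,
    )

    discs: list[list[Path]] = []          # files assigned to each disc
    free: list[int] = []                  # remaining bytes on each disc

    for f in files:
        size = f.stat().st_size
        for i, remaining in enumerate(free):
            if size <= remaining:
                discs[i].append(f)
                free[i] -= size
                break
        else:                             # nothing fits: start a new disc
            discs.append([f])
            free.append(DISC_CAPACITY - size)

    for i, contents in enumerate(discs, 1):
        total = sum(p.stat().st_size for p in contents)
        print(f"Disc {i}: {total / 1e9:.1f} GB in {len(contents)} files")

This only plans the layout (and will happily split a directory across discs), so related folders may still need a manual shuffle before burning.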

Making Disc image (.iso)

When making a disc image (.iso), I'd have to split the whole 160 GB image into smaller parts. On Linux this is possible; for Windows I didn't look much into it. But I'm not even sure this will achieve the desired outcome, since the discs would become bootable.

The only concern that remains for me here is: what if one of the seven discs gets corrupted? Can I still join the parts back into the original .iso file and extract some data from it? Is this even a concern for me, since I'm using the aforementioned 3-2-1 method for archival/backup?

.zip archives

Here I could use 7-Zip to split a large archive and even compress it. Is this method advisable? I'm fairly sure that the failure of one part would compromise the whole archive, which I don't want; I'd like each disc to be independent of the others. Do .tar archives behave the same way as .zip archives?

What if I have a single file larger than 25GB?

If I have a very large video file, I cannot just burn it as-is. Should I make an .iso or a (compressed) .zip archive and split it? What about parity data and/or ECC in this case?

I also read that files larger than 4 GB can cause issues and that UDF is recommended in that case. Can somebody confirm this? What do you do when burning a large file that does fit on one disc?
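
For the oversized-file case, any fixed-size split (7-Zip volumes, the Unix split tool, or a few lines of code) produces parts that are useless on their own, so every part has to survive to restore the file; that is where parity data earns its keep. A minimal byte-level split and re-join, with placeholder names and sizes:

    # Split a large file into disc-sized parts and join them back together.
    from pathlib import Path

    PART_SIZE = 23 * 10**9    # placeholder: stay under one disc's capacity
    CHUNK = 1 << 20           # stream in 1 MiB blocks

    def split(path: Path) -> list[Path]:
        parts = []
        with path.open("rb") as src:
            index = 0
            while True:
                part = Path(f"{path.name}.{index:03d}")
                written = 0
                with part.open("wb") as dst:
                    while written < PART_SIZE:
                        block = src.read(min(CHUNK, PART_SIZE - written))
                        if not block:
                            break
                        dst.write(block)
                        written += len(block)
                if written == 0:          # source exhausted, drop the empty part
                    part.unlink()
                    break
                parts.append(part)
                index += 1
        return parts

    def join(parts: list[Path], output: Path) -> None:
        with output.open("wb") as dst:
            for part in parts:
                with part.open("rb") as src:
                    while block := src.read(CHUNK):
                        dst.write(block)

    pieces = split(Path("big_video.mkv"))            # hypothetical input file
    join(pieces, Path("big_video.restored.mkv"))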

Should I use parity data and/or ECC from dvdisaster?

Basically, the question is in the heading, considering that I have two copies of the data on M-Discs and one on the HDD, and that every disc is independent of the others.

If yes, which option is the most robust? How should I go about storing the parity data or ECC files?

Writing speed and burn verification

Here I'd use the lowest (2x) speed and check "verify" when burning, for the best results. Also, I don't plan to copy a disc I've just burnt to another one, but rather redo the burn from the original source. Please say if something should be done differently.

I've read that completely filling up a disc isn't advisable. What, then, is the optimal limit to fill the discs to?

Is there any other advice you'd like to tell me or any aspect that I missed that you consider important? Please let me know. All suggestions are welcome.

Thank you very much for helping me.

Much appreciated and a lot of success in your data hoarding.

7896
 
 
The original post: /r/datahoarder by /u/Bart2800 on 2024-07-27 18:53:26.

What do you all use as a video format converter, AVI to MP4 for example? I saw HandBrake; is it any good? Until now I've used VLC, but I'm looking for better alternatives.
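
HandBrake is a solid choice; most GUI converters sit on top of FFmpeg, which can also be driven directly. A small sketch that re-encodes an AVI to H.264/AAC MP4 (file names are placeholders and ffmpeg is assumed to be on the PATH):

    # Convert an AVI to MP4 (H.264 video, AAC audio) by calling ffmpeg.
    import subprocess

    def convert(src: str, dst: str) -> None:
        subprocess.run(
            [
                "ffmpeg",
                "-i", src,             # input file
                "-c:v", "libx264",     # re-encode video to H.264
                "-crf", "18",          # lower = better quality, bigger file
                "-preset", "slow",     # slower encode, smaller output
                "-c:a", "aac",         # re-encode audio to AAC
                dst,
            ],
            check=True,
        )

    convert("movie.avi", "movie.mp4")

If the AVI already holds MP4-compatible streams, swapping the codec options for "-c copy" remuxes without re-encoding and takes seconds rather than minutes.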

7897
 
 
The original post: /r/datahoarder by /u/rez0n on 2024-07-27 17:13:28.

I'm a software developer and a bit of a data hoarder; I keep all historical project files and documents for every client. Please share your recommendations on file syncing if you have similar experience.

For many, many years I have synced all my files and projects using cloud providers; I started back when Dropbox first came out. It is convenient and partially replaces backups, in the sense that I can completely format or destroy my computer and not lose any data.

For the last few years I have been struggling with this and continuously migrating from one provider to another, because each has different issues.

What I have

  • ~4 million files
  • ~150 GB total size
  • Most of the files are Python / JS projects
  • Half of the directories are git repositories
  • Some projects contain node_modules / Python venvs (which contain a lot of symlinks)

What I want

  • Selective (smart) sync. I need to keep only active projects loaded on disk, on a per-file basis.
  • Syncing as fast as possible, taking into account the crazy number of files.
  • Correct behaviour when syncing git repos, with regard to branch switching and so on.
  • Ideally, support for an .ignore file or an alternative way to exclude specific dirs from sync.

My experience:

Right now I use macOS on all my computers, so cross-platform support is no longer a concern.

Dropbox - Before it migrated to the macOS "File Provider" framework it worked amazingly and met almost all of my demands. Now sync is sluggish and many sync errors appear; when I mark a directory as "Available offline" it downloads only partially and finishes downloading some files only after I try to access them.

Many Dropbox features are gone (like LAN sync and a user-defined location for the files), and access to files through File Provider is very slow.

Google Drive (File Stream) - I last tried it two years ago and was disappointed. Sync is fast, but it starts a full re-sync every week or two (which lasts a few days).

Resilio Sync - Not a cloud, but otherwise fine. Sync is fast and reliable, but there is no selective (smart) sync; you can only remove a directory from the computer completely.

Seafile - I was amazed by the syncing speed, but I stopped testing almost at the start because it can't sync all my files; Python venvs get broken on sync due to symlinked files.

iCloud - I made a few attempts and right now I have moved to it. It is fast on the initial sync, but then becomes very slow when you need to load remote files or after creating node_modules. Another critical issue: when I switch git branches it creates a lot of duplicate files, as it seems to see two versions of the same file at the same time.

Of course you could say I should just use git for this, but there are also a lot of files that should be ignored and not pushed to the repository, and it's better if those files are the same on all workstations for seamless work.

7898
 
 
The original post: /r/datahoarder by /u/ParisModelShow on 2024-07-27 13:54:51.

Hi all! Are HGST or Seagate regarded as the better helium drives, or are they on par? 16 TB or so. There is an article below (written by Seagate) that states Seagate's design is superior. Do any of them report helium levels via SMART attribute 22? I read that SMART 22 isn't actually a helium level; it uses a temperature sensor's resistance to derive that info. Or will the drive show other errors in SMART if there is a helium leak? Can I use them in a 3.5" external enclosure (any good ones you know of)?

https://www.seagate.com/files/www-content/product-content/enterprise-hdd-fam/enterprise-capacity-3-5-hdd-10tb/_shared/docs/helium-drive-launch-tp686-1-1602us.pdf
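
On the SMART question: where a drive exposes its helium level at all, it typically shows up as attribute 22 in smartmontools output, so it's easy to poll from a script. A small sketch, assuming smartmontools 7+ is installed and with a placeholder device path:

    # Read SMART attribute 22 (helium level, where supported) via smartctl.
    import json
    import subprocess

    result = subprocess.run(
        ["smartctl", "--json", "-A", "/dev/sda"],
        capture_output=True, text=True, check=False,   # smartctl uses nonzero exits for warnings
    )
    data = json.loads(result.stdout)

    table = data.get("ata_smart_attributes", {}).get("table", [])
    helium = next((attr for attr in table if attr["id"] == 22), None)

    if helium is None:
        print("This drive does not report SMART attribute 22.")
    else:
        print(f"{helium['name']}: normalized={helium['value']}, raw={helium['raw']['string']}")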

7899
 
 
The original post: /r/datahoarder by /u/AhsanaqDesign on 2024-07-27 12:52:43.

As we know, Pinterest doesn't allow you to download its videos, but there are many third-party tools for downloading them.

I have used different tools, but I found PinVideoDownloader.com to be one of the best for downloading Pinterest images, GIFs, and videos.

7900
 
 
The original post: /r/datahoarder by /u/Consistent-Scene2781 on 2024-07-27 10:27:53.

Hi everyone,

I'm looking for advice on how to reduce the file size of my scanned books while maintaining good quality. From my experience, Google Drive cannot view PDFs larger than 100 MB, so I need to ensure my files are under this limit.

I've been using ScanTailor, but I've encountered some issues:

  • When I use photo color mode, it produces TIFF RAW files over 4GB.
  • Black and white mode works well if the scanned document is good, but converting color documents to black and white sometimes ruins them.
  • For some scanned documents, I must use color mode at 600 dpi and sometimes downsize to 150 dpi, but this results in a loss of quality for color documents.
  • I convert the scanned documents with Adobe Acrobat Pro, which takes a long time, especially for OCR processing.
  • Sometimes, ScanTailor Advanced 2019.8.16 crashes when I use color mode with 600 dpi.

Does anyone have tips or recommended tools and settings for achieving this balance? I'm particularly interested in any techniques or software that can help optimize scanned PDFs without significantly compromising the quality of the text and images.
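
One common post-processing step is to let Ghostscript downsample and recompress the images inside the finished PDF; that alone often gets a large scan under the 100 MB mark without redoing the ScanTailor pass. A minimal sketch (Ghostscript is assumed to be installed; file names and the 150 dpi target are placeholders):

    # Downsample and recompress the images inside a PDF with Ghostscript.
    # On Windows the binary is usually gswin64c instead of gs.
    import subprocess

    def shrink_pdf(src: str, dst: str, dpi: int = 150) -> None:
        subprocess.run(
            [
                "gs", "-sDEVICE=pdfwrite",
                "-dCompatibilityLevel=1.5",
                "-dDownsampleColorImages=true",
                "-dDownsampleGrayImages=true",
                f"-dColorImageResolution={dpi}",
                f"-dGrayImageResolution={dpi}",
                f"-dMonoImageResolution={dpi}",
                "-dNOPAUSE", "-dBATCH", "-dQUIET",
                f"-sOutputFile={dst}",
                src,
            ],
            check=True,
        )

    shrink_pdf("scanned_book.pdf", "scanned_book_small.pdf")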
