freedomPusher

[–] freedomPusher@sopuli.xyz 0 points 1 year ago* (last edited 1 year ago) (2 children)

If you monitor IRC channels about email servers, you’ll find there are plenty of email admins unwilling to even go through the DKIM and DMARC hoops. An FQDN check not on the sending server but on the FROM field of a msg goes over-zealously above and beyond DKIM and DMARC. I’m quite fine with not reaching these fringe servers. I can always decide from the bounce msg whether it’s worth my effort to dignify their excessive hoops with a transmission to their persnickety liking.

[–] freedomPusher@sopuli.xyz -1 points 1 year ago (7 children)

How do you expect to receive replies from clearnet users, or are you okay not receiving replies?

Indeed that’s the idea. If you’ve ever received a message where the sender’s address is “noreply@corp.xyz”, it’s similar. But in fact the onion address is slightly more useful than a “noreply” address because the responder would at least have the option of registering with an onion-capable email server to reply.

Imagine you want to email a gmail user. You can ensure that the message contains nothing you don’t mind sharing with a surveillance advertiser, but you cannot generally control what gets shared in the response. An onion address ensures that replies will be outside of Google’s walled garden, for example. That’s just one of several use cases.

Also most mail hosts these days toss emails that dont match dmarc/dkim/spf, which would be especially hard to do for an onion email

Those are server to server authentication protocols, not something that validates the functionality of a sender’s disclosed email address. Otherwise how would a bank send an announcement from a “noreply” address?

[–] freedomPusher@sopuli.xyz 2 points 1 year ago* (last edited 1 year ago) (2 children)

Do you know who does care? The email server you’re sending messages to, because spammers and scammers love to try and send email with fake from addresses.

The receiving servers do not generally care what’s in the FROM field. They care that the sending server they are connected to is authorized and has their SPF, DKIM, and DMARC shit together. It’s not for the receiving server to control the email aliases of individual senders. Some rare over-zealous servers will look at the FROM field and expect the domain to match but if I encounter that, the collateral damage is what it is. I can always still decide from there whether it’s worthwhile to go through extra hoops.
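To make the distinction concrete, here is a minimal Python sketch (standard library only) of a message whose FROM header differs from the envelope sender that the sending server announces over SMTP. SPF, as far as I know, is evaluated against the envelope sender, not the FROM header. All addresses and hostnames below are made up for illustration.

```python
from email.message import EmailMessage
from email.utils import parseaddr


def domain_of(address: str) -> str:
    """Return the lowercased domain part of an email address."""
    return parseaddr(address)[1].rpartition("@")[2].lower()


def build_message(header_from: str, to_addr: str, body: str) -> EmailMessage:
    """Build a message whose From header is whatever the author typed."""
    msg = EmailMessage()
    msg["From"] = header_from
    msg["To"] = to_addr
    msg["Subject"] = "FYI, no reply needed"
    msg.set_content(body)
    return msg


# Hypothetical values, for illustration only.
ENVELOPE_SENDER = "bounces@mail.example.org"      # given in SMTP "MAIL FROM"
HEADER_FROM = "alice@exampleonionaddress.onion"   # shown to the recipient

msg = build_message(HEADER_FROM, "bob@example.com", "Nothing to answer here.")

envelope_domain = domain_of(ENVELOPE_SENDER)
header_domain = domain_of(str(msg["From"]))
domains_match = envelope_domain == header_domain

print("envelope sender domain:", envelope_domain)
print("From header domain:    ", header_domain)
print("domains match:         ", domains_match)
```

When sending with smtplib, the envelope sender is the from_addr argument of SMTP.send_message(), while the header stays exactly as written. A server that insists the two domains match is the rare over-zealous case described above.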

[–] freedomPusher@sopuli.xyz 1 points 1 year ago* (last edited 1 year ago)

People are pushovers and tend not to give a shit about banks excessively following the know-your-customer protocol well beyond what the law even requires. So why not mirror that success in the telecom domain? Followed by grocery stores and car mechanics next…

[–] freedomPusher@sopuli.xyz 1 points 1 year ago* (last edited 1 year ago)

Are you wanting to have a .onion TLD email address,

Yes, and that much exists. There are onion email providers, but when you email a clearnet recipient, they typically convert your onion email address to a clearnet address. That’s useful in most situations but there are also several use cases for not doing the conversion. But finding a service that accommodates the other use cases is hard, considering onion email is rare in itself.

and be able to communicate with non-TOR web servers?

No, nothing to do with the web. Just email.

The host needs to be able to look up addresses, and resolve them to a location.

Only for replies. But not all messages need a reply. See my other msg.

It would require having clearnet servers also connected to the TOR network which I would imagine is incredibly unlikely.

Those exist already (danwin, riseup, onionmail, etc). But they operate on the assumption that senders always want replies from the recipient to be possible via their receiving server. That’s not always desirable.

In the same way you can browse non onion sites through TOR but not the other way around,

There is a service that enables clearnet users to reach onion services (onion.to, onion.cat, etc), but this is unrelated. Web is unrelated.

you would likely be able to send email but not receive them

Bingo. That’s the point in some of the use cases.

[–] freedomPusher@sopuli.xyz 1 points 1 year ago

Delete hosted cloud. Move back to hosting your own.

How does that address the problem or promote privacy? Self-hosting makes it even more trivially easy to identify you. E.g. if I run my own Lemmy server, it would transact on an IP address that points to me. By anonymously creating an acct on sopuli.xyz and using Tor, doxxing becomes harder.

[–] freedomPusher@sopuli.xyz 0 points 1 year ago* (last edited 1 year ago) (4 children)

Not really an option

Sure it is. I can theoretically¹ do it myself with my mail server. If you use a mail client like (neo)mutt, you can literally free type whatever you want to put in the FROM field. IIRC, this contradicts no RFCs so long as there is a syntactically valid email address.

Ever get an email with a bogus address like “noreply@corporation.xyz”? It’s essentially the same. Not all e-mail addresses in the FROM field go to valid inboxes -- nor are they required to.

¹ The reason I say “theoretically” is that some exceptional SMTP servers check that the domain portion of the FROM email passes an MX lookup or that the DNS lookup matches the sending server. It’s a rare configuration. I have no domain name so my mail server always sends msgs with a “spoofed” email address (which is often valid but not related to my IP). I also write in completely bogus email addresses in some cases where no reply is needed. Very few servers reject on that basis. The other complication is that many mail services disallow outbound messages with a different address than what they assigned to a user.

since the onion TLD isn’t accessible to clearnet servers. How are email servers supposed to reach out the onion domain name and mail server if they can’t resolve it?

You’re talking about using the FROM address for replying purposes. The point of having this option is to make replies very difficult, but still possible.

Mail servers can be configured to handle onion addresses. I’ve configured postfix to do that. But indeed most servers are not configured to handle onions, which any users who make use of the feature would need to be aware of. It’s a useful scenario because it can be used to force recipients out of Google’s and Microsoft’s walled gardens, and give them incentive to join the free world away from surveillance advertisers, for example. They must join an onion-capable email service if they want to reply.

[–] freedomPusher@sopuli.xyz 1 points 1 year ago

I can barely see the point of BIOS passwords. They are slightly useful if you don’t want guests using a machine for some reason. If you don’t have a BIOS pw, the OS login is good enough unless you need to stop them booting their own media. On all desktops it is rightfully easy to clear the BIOS. There are jumpers specifically for this purpose, apart from also just popping out the CR2032 battery or unseating the BIOS chip (old models). The BIOS pw does not (and should not) protect from data access at the hands of someone who can open the box.

The only failure I see here is the fact that Lenovo tried to make the bios unclearable in the first place, thus increasing e-waste. That’s the real story. The security fail is nothing interesting.. it’s the attempt of ecocide that should have the focus.

[–] freedomPusher@sopuli.xyz 1 points 1 year ago* (last edited 1 year ago)

Law is driven by philosophy. When discussing high-level laws at the constitutional level and above (international/human rights), “law” loses effectiveness as such and becomes more of a philosophical guide. It’s not concrete when specific scenarios are not pinned down, and rarely enforced as a consequence. There is an abstract human right that we have freedom of religion, but national law can often contradict human rights.

There are no Amish communities in Europe (and AFAIK, no notable religions that oppose the digital transformation). So there would not likely be national law that protects them. The question is hypothetical. Answering it requires understanding the meaning, purpose, and history of the freedom of religion, which itself would never be elaborated in law. The law is clean, hard and fast, without history and usually without rationale.

It’s an inherently philosophical question but with legal interplay. So it’s a 10,000 foot view question of how freedom of religion gets implemented in Europe. The philosophy cannot be neglected because it’s the driver.

Namely: Does Belgium law require agencies and companies to provide offline interfaces if a religion requires not using digital services/technology.

I would guess unlikely because there are no such religions in Belgium, AFAIK. The Amish would be in for a struggle. They would have to bring a complaint to court about digital transformation excluding them with no concrete law covering them, and try to cling to that rarely enforced body of human rights law. They might prevail in a high court, but what about someone who is not Amish, but who has the same moral objections? The Amish are Christians who morally object to lots of technology but strictly speaking the anti-tech is not really driven by Christianity. It’s more of a culture that is fused with their religion, which enables them to benefit from religious protections despite Christianity not being the driver. So a non-religious person who finds the forced use of technology to be as unconscionable as an Amish practitioner would be equally oppressed, but would a court recognize this? Probably not, but if Amish were to arrive, then the question is would the law be written specifically to protect the Amish or would it be generalized enough that non-religious people would benefit? It’s all a question/prediction based on philosophy, psychology, law, and history.

[–] freedomPusher@sopuli.xyz 1 points 1 year ago

StreetComplete shows me no map, just quests on a blank canvas. OsmAnd shows my offline maps just fine, but apparently StreetComplete has no way to reach the offline maps. I suppose that’s down to Android security -- each app has its own storage space secure from other apps.

In principle, we should be able to put the maps on shared SD card space and both apps should access it. But StreetComplete gives no way in the settings of specifying the map location. And apparently it fails to fetch an extra copy of the maps as well in my case.

[–] freedomPusher@sopuli.xyz 2 points 1 year ago* (last edited 1 year ago)

These “demographics” are not being excluded, ATM they’re just being inconvenienced.

Have another look at that list. People using public libraries are not necessarily tech savvy; more likely the contrary. People whose ISP chose to implement CGNAT don’t necessarily even have any idea that they are sharing a WAN IP address. AOS ≤6 users are likely clueless; they will just blame their phone and buy another.

I’m not sure it’s fair to say all Tor users are hackers considering Tails is simply a matter of booting with a USB stick inserted. Many VPN users are avg Joes just trying to reach BBC or torrents. VPN services have had to lower the bar to be usable by low tech folks.

Not everyone is a tech wizard of the highest order, so I really can’t fault their web masters for using what’s easy and convenient.

I agree it’s much more likely a case of ignorance than malice. But nonetheless they have a professional duty to become informed about the consequences of their choices. Using CF is a proactive move¹. They are responsible for excluding people and should be held accountable until they get it right.

¹ OTOH, 4 of those parties have apparently outsourced to “Nationbuilder”, whereas Défi apparently did their own CF deployment. Whoever is behind Nationbuilder owns their incompetence. It’s reasonable to expect an outsourced contractor to know what they’re doing. The political parties should choose a different contractor who can do the job without excluding people.

update


Digital exclusion is a big problem in Belgium. Current politicians are dismantling analog infra and forcing people online for government transactions. And those gov resources are often broken and/or exclusive. People are often forced to disclose more info than necessary (in violation of the GDPR minimisation principle). So it is actually becoming quite important to elect people who oppose digital exclusion.

[–] freedomPusher@sopuli.xyz 1 points 1 year ago* (last edited 1 year ago) (2 children)

Cloudflare is not transparent about who they block. They essentially (and falsely) say they only block baddies, in so many words. A list of groups found to be marginalized can be found here:

https://thefreeworld.noblogs.org/post/2024/03/18/cloudflare-has-created-the-largest-most-rigidly-exclusive-walled-garden-in-the-world/

A consequence of the non-transparency is that the list is inherently incomplete. Certainly if you use Ungoogled Chromium to reach the websites of those parties over the Tor network, you will be blocked hard and fast (not even a CAPTCHA offer).

For example:

 

cross-posted from: https://sopuli.xyz/post/9861733

I would cast my drop-in-the-ocean vote if it didn’t require needlessly reckless disclosures. The question is- which states offer more privacy than others? These are some of the issues:

publication of residential address


It’s obviously fair enough that you must disclose your residential address to the election authority so you get the correct ballot. But then the address is public. WTF? I’m baffled that the voter turnout isn’t lower.

Exceptionally, Alaska enables voters to also supply a mailing address along with their residential address. In those cases, the residential address is not made public. But still an injustice as PO Boxes are not gratis so privacy has a needless cost.

Some states give the mailing address option exclusively to battered spouses. So if you are a victim of domestic abuse, you can go through a process by which you receive an address for the public voting records that differs from your residential address. Only victims of domestic abuse get privacy that should be given to everyone.

publication of political party affiliation


You are blocked from voting in primary elections unless you register a party affiliation, in which case you can only vote in the primary election of that party. A green party voter cannot vote in the democrat primary despite the parties being similar. The party you register in is public. So e.g. your neighbors, your boss, and your prospective future boss can snoop into your political leanings.

AFAIK, this is the same for all states.

publication of your voting activity (which is used for shaming)


Whether you voted or not is public. If you register to vote but do not vote, it’s noticed. There is a shaming tactic whereby postcards are sent saying “your neighbors the Johnsons at 123 Main St. voted early -- will you do your civic duty too? Note that the McKinneys at 125 Main St. have not voted; perhaps you can remind them?” They of course do this in an automated way, so non-voters know their neighbors are receiving postcards that say they did not partake in their civic duty.

forced disclosure to Cloudflare


These states force all voter registrations through Cloudflare:

  • Arizona
  • Florida
  • Georgia
  • Hawaii
  • Idaho
  • New York
  • Ohio
  • Rhode Island
  • Washington

That’s not just public info, but everything you submit with your registration including sensitive info like DL# and/or SSN goes to Cloudflare Inc. Cloudflare is not only a privacy offender but they also operate a walled garden that excludes some demographics of people from access. Voters can always register on paper, but whoever the state hires to do the data entry will likely use the Cloudflare website anyway. So the only way to escape Cloudflare getting your sensitive info in the above-mentioned states is to not register to vote.

To add to the embarrassment, the “US Election Assistance Commission” (#USEAC) has jailed their website in Cloudflare’s walled garden. Access is exclusive and yet they proudly advertise: “Advancing Safe, Secure, Accessible Elections”.

solutions


What can a self-respecting privacy seeker do? When I read @BirdyBoogleBop@lemmy.dbzer0.com’s mention¹ of casting a “spoiled” vote which gets counted, I thought I’ll do that.. but then realized I probably can’t even get my hands on a ballot if I am not registered to vote. So I guess the penis drawing spoiled vote option only makes a statement about the ballot options. It’s useless for those who want to register their protest against the voter registration disclosures.

Are there any states besides Alaska that at least give voters a way to keep their residential address out of publicly accessible records?

  1. it was mentioned in this thread: https://lemmy.dbzer0.com/post/8502419
 

 

In the US banks are capturing the voices of their customers who contact their call centers for any reason. So if a USian vocally says something controversial they probably have no hope of anonymity if they called their bank in recent years.

Is the same thing not happening in Russia and Israel? An IDF soldier came on broadcast radio and criticized Israel, and a Russian citizen criticized Putin. Shouldn’t they be concerned about doxxing risks?

It would be reckless if the radio station did not disguise their voices, but I don’t get the impression their voices are being disguised. So I just wonder if voice disguising tech is so good at making the voice sound natural that it’s not detectable.

 

This is interesting but quite unfortunate. As individuals we often spot #GDPR infringements in situations where we are not a victim. The GDPR does not empower us to act with any slight expectation of getting results. There is no reporting mechanism and no remedial correction if the complainant’s own personal data was not mishandled. No Article 77 possibility.

Paragraph 2 page 3:

The GDPR does not explicitly define what constitutes a complaint but Article 77 gives a first understanding providing that “every data subject shall have the right to lodge a complaint (…) if the data subject considers that the processing of personal data relating to him or her infringes this Regulation”.

Page 4 examples of non-complaints:

  • a suggestion made by a natural person that he or she thinks that a particular company is not compliant with the GDPR as long as he or she is not among the data subjects.

There is a hack but it’s purely the DPA’s discretion whether to act. From page 5:

The supervisory authority may act upon its own motion (ex officio), e.g., after being “informed otherwise of situations that entail possible infringements” (e.g. by the press, another administration, a court, or another private company, a hint by a natural person which is however not a complaint within the meaning of Article 77).

So a natural person can tattle (tip off) the DPA but the DPA can simply ignore it. If the DPA feels like it, they can act on it as their own initiative (not under Art.77), which means the whistle blower can (and likely will) be kept out of the loop and in the dark. So such reports might as well be sent anonymously. And if it’s not a big interesting case (e.g. involving a tech giant), it’s probably unlikely a DPA will act.

Why this is a problem


I often want to engage with a data controller but their procedures demand irrelevant info in violation of data minimisation. In principle I should be able to use a corrective process to make the data controller compliant before I engage them. There is no useful mechanism unless a prospective data subject partakes in subjecting themself to a breach (self harm) before filing an Art.77 complaint.

 

I’m very grateful that #AnonymousOverflow exists and was already in place to give us refuge when #Stackexchange et al returned to #Cloudflare’s jail. I use this search service because it automatically integrates (SE→AO) replacement:

https://search.fabiomanganiello.com/search

A search led to this thread:

https://overflow.manganiello.tech/exchange/tex/questions/225027/how-to-create-new-font-which-is-thicker-version-of-computer-modern

The three links in the itemized list all point to Stackexchange, which puts the exclusion problem back in our face -- for those who are blocked from Cloudflare. Anonymous Overflow (AO) should eat its own #dogFood. Like the fabiomanganiello search service, AO should replace SE links with AO links within SE pages.

Yes, it may be a bit tricky because AO has a number of instances which go up and down. The onion ones are quite flaky. In principle, SE links should be replaced with the same instance the article is viewed on.
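A rough Python sketch of the same-instance replacement suggested above. The `/exchange/<site>/...` path layout is only inferred from the AO URL quoted in this post, and the function names are hypothetical. This is not how AO is actually implemented, just an illustration of the idea.

```python
import re
from urllib.parse import urlsplit, urlunsplit

SE_SUFFIX = ".stackexchange.com"


def rewrite_se_link(url: str, ao_base: str) -> str:
    """Map https://<site>.stackexchange.com/<path> onto the given AO instance.

    Assumed layout (inferred from one AO URL): <ao_base>/exchange/<site>/<path>.
    Links that are not plain <site>.stackexchange.com links are returned as-is.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if not host.endswith(SE_SUFFIX):
        return url
    site = host[: -len(SE_SUFFIX)]
    if not site or "." in site:
        # e.g. meta.<site>.stackexchange.com; layout unknown, leave untouched
        return url
    base = urlsplit(ao_base)
    new_path = f"/exchange/{site}{parts.path}"
    return urlunsplit((base.scheme, base.netloc, new_path, parts.query, parts.fragment))


HREF = re.compile(r'href="([^"]+)"')


def rewrite_links_in_html(html: str, ao_base: str) -> str:
    """Rewrite every href in a page so SE links stay on the viewing instance."""
    return HREF.sub(lambda m: f'href="{rewrite_se_link(m.group(1), ao_base)}"', html)


page = '<li><a href="https://tex.stackexchange.com/questions/225027/x">one</a></li>'
print(rewrite_links_in_html(page, "https://overflow.manganiello.tech"))
```

Passing the instance's own base URL as `ao_base` is what keeps readers on the instance they are already viewing, which sidesteps the question of which other instances happen to be up.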

This #bug is posted here because the bug tracker is exclusively on MS #Github:

https://github.com/httpjamesm/AnonymousOverflow

 

Normally it’s possible to import comments into the Sopuli timeline by querying on URLs of external comments that are not yet local. Thereafter, it’s possible to interact with imported msgs.

But when I linked to an external comment (https://jlai.lu/comment/5309447) in this thread, and then later queried the URL of the comment, the search stops upon finding my own local mention of that comment. If the search feature is going to stop upon finding local results, then there needs to be a “go deeper” button, or an “import” button to give a means to import a comment.

As a consequence of this bug, I cannot reply to https://jlai.lu/comment/5309447 from Sopuli.

Also notable is if the search category is narrowed from “ALL” to “URL”, nothing results.
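For anyone who wants to reproduce this outside the web UI: as far as I know, importing a remote object corresponds to the `resolve_object` endpoint of Lemmy’s v3 HTTP API, which takes the remote URL in a `q` parameter and normally requires a logged-in session. The Python sketch below only builds the request URL. The endpoint path is an assumption to verify against your instance’s version, and the helper name is hypothetical.

```python
from urllib.parse import urlencode


def resolve_object_url(instance: str, remote_url: str) -> str:
    """Build the (assumed) Lemmy v3 URL that asks an instance to import a remote object."""
    query = urlencode({"q": remote_url})
    return f"https://{instance}/api/v3/resolve_object?{query}"


url = resolve_object_url("sopuli.xyz", "https://jlai.lu/comment/5309447")
print(url)
```

If the API call imports the comment while the search box does not, that would point at the UI’s handling of local matches rather than at federation itself.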

#LemmyBug

 

cross-posted from: https://sopuli.xyz/post/9076220

I posted this thread on jlai.lu. I got no replies as far as I could see from sopuli -- no notifications, and when I enter that thread there are still zero replies. But when I visit the thread on the hosting instance, I see a reply. This behavior is the same as if I were blocking that community -- but I am not.

When I search in sopuli for the direct link to the comment, the search finds it. And then I was able to forcibly interact with the comment.

I have to wonder how often someone replies to me and I have no idea because the response is hidden from me. This is a serious bug. Wholly unacceptable for a platform designed specifically for communication.

update 1 (another occurrence)


Here’s another thread with the same issue. Zero replies when I visit that thread mirror within sopuli, but 3 replies when visiting direct. I was disappointed that high-effort post got no replies. Now 2 months later I see there actually were replies. I will search those comment URLs perhaps in a couple days to interact. But I’ll hold off in case someone wants to investigate (because I think the act of searching those URLs results in copying the comments which could interfere with the investigation).

update 2 (subscription relevancy)


I was asked if I am subscribed to the community. Good question! The answer is no, so there’s a clue. Perhaps mentions do not trigger notifications if no one on the instance of the mentioned account is subscribed to the community. This could be the root cause of the bug.

 


#LemmyBug

 

Kensanata’s mastodon-archive tool was originally working as expected to archive posts from eattherich.club. Then out of the blue one day it started printing this:

Loading existing archive: eattherich.club.user.bob.json
Get user info
Get new statuses
Seen 10 duplicates, stopping now.
Use --no-stopping to prevent this.
Added a total of 25 new items
Get new favourites
Seen 10 duplicates, stopping now.
Use --no-stopping to prevent this.
Added a total of 7 new items
Get bookmarks (this may take a while)
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/mastodon-archive/mastodon-archive.py", line 5, in <module>
    mastodon_archive.main()
  File "/usr/lib/python3/dist-packages/mastodon-archive/mastodon_archive/__init__.py", line 333, in main
    args.command(args)
  File "/usr/lib/python3/dist-packages/mastodon-archive/mastodon_archive/archive.py", line 148, in archive
    bookmarks = mastodon.bookmarks()
  File "<decorator-gen-59>", line 2, in bookmarks
  File "/usr/lib/python3/dist-packages/mastodon/Mastodon.py", line 96, in wrapper
    raise MastodonVersionError("Version check failed (Need version " + version + ")")
mastodon.Mastodon.MastodonVersionError: Version check failed (Need version 3.1.0)

The count of new items never resets to zero. It should go back to zero after every fetch, so this implies fetching no longer occurs (or at least it no longer finishes). The current version of eattherich.club is Mastodon v4.2.5. Not sure if a version upgrade would be related. Other Mastodon instances do not have this issue.
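A side note on the error itself: “Need version 3.1.0” reads like a minimum-version gate on the bookmarks call, and 4.2.5 is plainly newer than 3.1.0. The Python sketch below is not Mastodon.py’s code. It is only a plain comparison of the kind the message implies, with hypothetical function names.

```python
import re
from typing import Tuple


def parse_version(text: str) -> Tuple[int, int, int]:
    """Extract the first MAJOR.MINOR.PATCH triple from a version string."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch)


def meets_minimum(server_version: str, required: str) -> bool:
    """True when the server version is at least the required version."""
    return parse_version(server_version) >= parse_version(required)


print(meets_minimum("4.2.5", "3.1.0"))
print(meets_minimum("2.9.3", "3.1.0"))
```

Since a straightforward comparison passes for 4.2.5, one guess is that the client is working from a different version string than the one the instance advertises (stale, or in a format it cannot parse). That is speculation on my part, not a diagnosis.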

The bug tracker is on MS Github, thus out of reach for me:

https://github.com/kensanata/mastodon-archive/issues

 

cross-posted from: https://sopuli.xyz/post/8936481

I would like to get to the bottom of what I am doing wrong that leads to black and white documents having a bigger filesize than color.

My process for a color TIFF is like this:

  1. tiff2pdf
  2. ocrmypdf
  3. pdf2djvu

Resulting color DjVu file is ~56k. When pdfimages -all runs on the intermediate PDF file, it shows CCITT (fax) is inside.

My process for a black and white TIFF is the same:

  1. tiff2pdf
  2. ocrmypdf
  3. pdf2djvu

Resulting black and white DjVu file is ~145k (almost 3× the color size). When pdfimages -all runs on the intermediate PDF file, it shows a PNG file is inside. If I replace step ① with ImageMagick’s convert, the first PDF is 10 MB, but in the end the resulting DjVu file is still ~145k. And PNG is still inside the intermediate PDF.

I can get the bitonal (bilevel) image smaller by using cjb2 -clean, which goes straight from TIFF to DjVu, but then I can’t OCR it due to the lack of PDF intermediate version. And the size is still bigger than the color doc (~68k).

#askFedi

update


I think I found the problem, which would not be evident from what I posted. I was passing the --force-ocr option to ocrmypdf. I did that just to push through errors like “this doc is already OCRd”. But that option does much more than you would expect: it transcodes the doc. Looks like my fix is to pass --redo-ocr instead. It’s not yet obvious to me why --force-ocr impacted bilevel images more.
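For reference, the three steps with the OCR option swapped can be written down as a small Python sketch. It only builds and prints the command lines and does not run anything. The intermediate file names are arbitrary choices of mine, and `pdfimages -list` is used here as a quick way to see which encoding ended up inside each PDF.

```python
import shlex
from pathlib import Path
from typing import List

OCR_MODES = {"--redo-ocr", "--force-ocr"}


def build_pipeline(tiff: str, ocr_mode: str = "--redo-ocr") -> List[List[str]]:
    """Return the tiff2pdf -> ocrmypdf -> pdf2djvu commands for one TIFF.

    The intermediate file names (*.raw.pdf, *.ocr.pdf) are arbitrary.
    """
    if ocr_mode not in OCR_MODES:
        raise ValueError(f"unexpected OCR mode: {ocr_mode}")
    stem = Path(tiff).with_suffix("")
    raw_pdf = f"{stem}.raw.pdf"
    ocr_pdf = f"{stem}.ocr.pdf"
    djvu = f"{stem}.djvu"
    return [
        ["tiff2pdf", "-o", raw_pdf, tiff],
        ["ocrmypdf", ocr_mode, raw_pdf, ocr_pdf],
        ["pdf2djvu", "-o", djvu, ocr_pdf],
    ]


def inspect_command(pdf: str) -> List[str]:
    """List the images embedded in a PDF, including their encoding."""
    return ["pdfimages", "-list", pdf]


commands = build_pipeline("scan.tiff")
for cmd in commands:
    print(shlex.join(cmd))
print(shlex.join(inspect_command("scan.ocr.pdf")))
```

Running the inspection on both scan.raw.pdf and scan.ocr.pdf, once per OCR mode, should show whether ocrmypdf re-encoded the image on the way through.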

 


 

The flagship instance for Matrix demonstrates the use of Cloudflare, which was found to be necessary to defend against DoS attacks. This CaaC (Cloudflare-as-a-Crutch) design has many pitfalls & problems, including but not limited to:

  • digital exclusion (Cloudflare is a walled garden that excludes some groups of people)
  • supports a privacy hostile tech giant
  • adds to growth and dominance of an oppressive force
  • exposes metadata to a privacy offender without the knowledge and consent of participants
  • reflects negatively on the competence, integrity, and digital rights values of Matrix creators
  • creates a needless dependency on a tech giant

#CaaC needs to be replaced with a #securityByDesign approach. Countermeasures need to be baked into the system, not bolted on. The protocol should support mechanisms such as:

  • rate limiting/tar pitting
  • proof-of-work with variable levels of work and a prioritization of traffic that’s proportional to the level of work, which can be enabled on demand and generally upon crossing a load threshold.
  • security cookie tokens to prioritize traffic of trusted participants
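To give an idea of the proof-of-work item, here is a hashcash-style Python sketch: the required difficulty is zero below a load threshold and rises with load, and requests that did more work are served first. Nothing here comes from the Matrix spec. The function names, the 0.7 threshold, and the 20-bit ceiling are all made-up illustration values.

```python
import hashlib
from itertools import count


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def work_done(challenge: bytes, nonce: int) -> int:
    """Server side: one hash verifies how much work a client did."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return leading_zero_bits(digest)


def solve(challenge: bytes, difficulty: int) -> int:
    """Client side: search for a nonce meeting the difficulty (in bits)."""
    for nonce in count():
        if work_done(challenge, nonce) >= difficulty:
            return nonce


def required_difficulty(load: float, threshold: float = 0.7, max_bits: int = 20) -> int:
    """No work below the load threshold, then scale up to max_bits at full load."""
    load = min(load, 1.0)
    if load <= threshold:
        return 0
    return max(1, round((load - threshold) / (1 - threshold) * max_bits))


challenge = b"server-issued-random-challenge"
requests = [("light", solve(challenge, 4)), ("heavy", solve(challenge, 12))]

# Serve the requests that proved the most work first.
queue = sorted(requests, key=lambda r: work_done(challenge, r[1]), reverse=True)
print([name for name, _ in queue])
print(required_difficulty(0.5), required_difficulty(0.85), required_difficulty(1.0))
```

Verification costs the server a single hash per request, while the client’s cost roughly doubles with each extra bit, which is what makes the scheme usable as an on-demand throttle.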

Sadly, #Matrix is aligned with another nefarious tech giant, and has jailed its project in Microsoft Github. And worse, they have a complex process for filing bugs/enhancements against the spec:

https://github.com/matrix-org/matrix-spec-proposals/blob/main/README.md

Hence why this bug report is posted here.
