I feel with you

Posted Aug 31, 2024 18:37 UTC (Sat) by ballombe (subscriber, #9523)
In reply to: I feel with you by dskoll
Parent article: A SpamAssassin surprise

There are web forms that reject email addresses with a third-level domain, like user@foo.bar.com.
Yes, this is stupid.

I feel with you

Posted Aug 31, 2024 20:27 UTC (Sat) by Wol (subscriber, #4433) [Link] (1 responses)

Dunno about other countries, but my email has a "top level domain" of .org.uk. Where does that leave me?

Cheers,
Wol

I feel with you

Posted Sep 2, 2024 10:26 UTC (Mon) by pbonzini (subscriber, #60935) [Link]

I suppose that's easily fixed with a better regular expression.

(That's sarcasm, or rather disillusionment, if that wasn't clear).

I feel with you

Posted Aug 31, 2024 22:01 UTC (Sat) by dskoll (subscriber, #1630) [Link] (20 responses)

Wow, I'm surprised. As I said, I've never encountered such a web site, and if I ever did, I'd quickly decide that its product or service is something I can do without.

I feel with you

Posted Sep 1, 2024 7:04 UTC (Sun) by ceplm (subscriber, #41334) [Link] (19 responses)

Have you ever heard about, let’s say, the BBC or the University of Oxford?

Their canonical sites are https://www.bbc.co.uk and https://www.ox.ac.uk/, respectively. All UK domains are divided this way (and the Aussies copied the scheme from them, e.g. https://www.abc.net.au/).

I feel with you

Posted Sep 1, 2024 9:51 UTC (Sun) by excors (subscriber, #95769) [Link] (18 responses)

I think dskoll was talking about sites that reject email addresses from such domains, not about sites that are hosted on such domains.

But given how common those domains have been since the early days of the internet, I'd guess the email validation systems usually aren't quite that dumb; they'll probably use something like the Public Suffix List (https://publicsuffix.org/), which tries to identify all the registrar-controlled suffixes that people can register under (so it has "com" and "uk" but also "ac.uk", "co.uk", "*.sch.uk" (the third level is the local authority name), etc). Or they'll use an incomplete, outdated, broken version of that list encoded in a regex they found on some blog two decades ago, which is fine for all the people with hotmail.co.uk addresses, and only causes a problem for the relatively few people using mail.companyname.com or new TLDs.

(The PSL is also used by e.g. DMARC to identify domains that belong to the same organisation: https://dmarc.org/2023/05/m3aawg-calls-for-coalition-to-s...)
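
To make the suffix matching described above concrete, here is a minimal sketch in Python. It is not the official algorithm: the rule set is a tiny hand-picked sample rather than the real list, exception rules (the "!" entries in the real file) are left out, and the function names `public_suffix` and `registrable_domain` are made up for illustration.

```python
from typing import Optional, Set

# Tiny hand-picked sample for illustration only; the real list at
# https://publicsuffix.org/ is far larger and changes over time.
SAMPLE_RULES = {"com", "uk", "ac.uk", "co.uk", "org.uk", "*.sch.uk"}


def public_suffix(domain: str, rules: Set[str] = SAMPLE_RULES) -> str:
    """Return the longest suffix of `domain` that matches a rule."""
    labels = domain.lower().rstrip(".").split(".")
    # Try candidates from longest to shortest so the longest match wins.
    for i in range(len(labels)):
        candidate = labels[i:]
        exact = ".".join(candidate)
        # A wildcard rule matches any single label in the leftmost position.
        wildcard = ".".join(["*"] + candidate[1:])
        if exact in rules or wildcard in rules:
            return exact
    # Assumed fallback: treat the last label as the suffix if nothing matches.
    return labels[-1]


def registrable_domain(domain: str, rules: Set[str] = SAMPLE_RULES) -> Optional[str]:
    """Return the public suffix plus one more label, or None if there is none."""
    labels = domain.lower().rstrip(".").split(".")
    suffix_len = len(public_suffix(domain, rules).split("."))
    if len(labels) <= suffix_len:
        return None  # the domain is itself a public suffix
    return ".".join(labels[-(suffix_len + 1):])
```

With this sample, hotmail.co.uk and mail.companyname.com both resolve to a registrable domain (hotmail.co.uk and companyname.com), so neither has "too many parts". A validator that counts dots instead cannot tell the two cases apart.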

I feel with you

Posted Sep 1, 2024 12:33 UTC (Sun) by yeltsin (guest, #171611) [Link] (8 responses)

Speaking as a web developer, you're giving way too much credit to (some) web developers. Most email validation I've seen relies on regexes, or similar logic written by hand. The good news is they're often easy to circumvent if you know your way around the browser developer tools, because there is no server side validation. Sometimes just disabling JavaScript is enough to get through.

I feel with you

Posted Sep 2, 2024 2:29 UTC (Mon) by NYKevin (subscriber, #129325) [Link] (7 responses)

The other problem is web developers who blithely assume that an email address is suitable as a username. The standards give you basically no guarantees at all about the semantics of the local part (the part before the @). A mail server is fully within its rights to decide that foobar@example.com, FOO.BAR@example.com, and something+completely+different@example.com are all the same mailbox, or all different mailboxes, and can even follow different rules for different mailboxes (perhaps in order to transition to a new set of semantics, while grandfathering old mailboxes to avoid inconvenience to users), as well as change all of these rules whenever it feels like it. The RFCs are completely clear that nobody other than the recipient server should be parsing or interpreting the local part, for the explicit purpose of allowing mail servers to do things like this. So if you use email addresses as usernames, you have accepted the reality that multiple "different" accounts may send notification emails to the same mailbox in some cases. Now you throw in attackers trying to take over an innocent user's account (or maybe just cause annoyance), and the fact that the average user has no idea that this feature exists (and is therefore perfectly happy to click a "confirm your email address" link from a service where they really do have an account), and all sorts of unpleasant things might come of that.
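
As a rough illustration of "don't interpret the local part", the sketch below normalises only the domain and keeps everything before the last @ exactly as typed. The function name `storage_key` is hypothetical, and this is one possible approach under those assumptions, not a recommendation from the RFCs.

```python
def storage_key(address: str) -> str:
    """Normalise only the domain; keep the local part exactly as entered."""
    local, sep, domain = address.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError(f"not an address: {address!r}")
    # Domain names are case-insensitive, so lowercasing the domain is safe.
    # The local part is left untouched: only the receiving server knows
    # whether case, dots, or "+" suffixes are significant.
    return f"{local}@{domain.lower()}"
```

Note what this does not solve: foobar@example.com and FOO.BAR@example.com get different keys, yet the server may still deliver both to one mailbox. The site cannot detect that from the address alone, which is exactly the risk described above.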

I feel with you

Posted Sep 2, 2024 9:34 UTC (Mon) by yeltsin (guest, #171611) [Link] (6 responses)

To give a concrete example, sourcehut uses '~' and '/' in the left part of the address, which breaks some badly written email clients and even some mail providers. Their response is always "wontfix, report it to your email client/mail provider", which is probably for the best.

e.g. ~sircmpwn/sr.ht-packages@lists.sr.ht

I guess it's one more reason we need diversity in both, because it forces everybody to comply with the standards. Gmail is very specific on what you can (or cannot) do with your username; other large providers are similar. If that's the only thing everyone is using, it is the new de-facto standard.

I feel with you

Posted Sep 2, 2024 9:56 UTC (Mon) by anselm (subscriber, #2796) [Link] (3 responses)

I'm generally happy that most places seem to have caught on to the idea that foo+bar@example.com is, in fact, a valid e-mail address in spite of the “+”. This used to be a major annoyance a few years ago.

I feel with you

Posted Sep 3, 2024 0:21 UTC (Tue) by khim (subscriber, #9252) [Link] (2 responses)

Gmail did that. In Gmail everything after + is ignored so you may use JoeAverage+shopping@gmail.com on one web site and JoeAverage+work@gmail.com on some other web site (and can use filters to label them differently on the receiving side).

Gmail is large enough and common enough to force developers to accept these.

I'm not sure smaller mail providers can do that.

I feel with you

Posted Sep 3, 2024 7:41 UTC (Tue) by Wol (subscriber, #4433) [Link] (1 responses)

> Gmail is large enough and common enough to force developers to accept these.

That "+" thing is in the original RFC. Gmail didn't force others to accept it - they would have *been* forced to accept it.

Cheers,
Wol

I feel with you

Posted Sep 3, 2024 14:08 UTC (Tue) by khim (subscriber, #9252) [Link]

Since when have things worked that way? Things like !, /, or even @ are in the original RFC, too. Yet web sites don't support them (or, at least, rarely do).

Why do you think + is an exception?

I feel with you

Posted Sep 2, 2024 13:59 UTC (Mon) by dskoll (subscriber, #1630) [Link] (1 responses)

Yes, people don't understand email. It's very annoying.

Web developers: The only thing you should do to validate an email address is (1) check that it contains an @ character, and (2) check that everything after the last @ character is a domain name that has either an A or an MX record that ultimately points to at least one routable IPv4 or IPv6 address. That's it. Nothing else.

I say last @ character because "@$foo"@example.com is theoretically a valid form of email address, though I doubt it'd actually work in many places.
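
A minimal sketch of the two checks above, in Python. The DNS lookup is left as a caller-supplied function (`domain_has_mail_route`, a hypothetical name) because Python's standard library has no MX lookup; a real implementation would plug in a resolver library there. Everything else about the address is deliberately not checked.

```python
from typing import Callable, Tuple


def split_at_last_at(address: str) -> Tuple[str, str]:
    """Split into (local part, domain) at the LAST @ character."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        raise ValueError("address contains no @ character")
    return local, domain


def is_acceptable(address: str, domain_has_mail_route: Callable[[str], bool]) -> bool:
    """Apply only the two checks: an @ exists, and the domain can receive mail."""
    try:
        _local, domain = split_at_last_at(address)
    except ValueError:
        return False
    # The callable is assumed to do the MX/address-record lookup for `domain`.
    return bool(domain) and domain_has_mail_route(domain)
```

Splitting at the last @ is what keeps "@$foo"@example.com intact: the local part comes back as "@$foo" (quotes included) and the domain as example.com.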

I feel with you

Posted Sep 3, 2024 13:44 UTC (Tue) by brunowolff (guest, #71160) [Link]

> I say last @ character because "@$foo"@example.com is theoretically a valid form of email address, though I doubt it'd actually work in many places.

This illustrates another issue with email addresses. There are at least three different encodings for them (raw, 821 and 822). "@$foo"@example.com might be a raw address or it might be an 821 or 822 encoding of @$foo@example.com .

Badly coded websites

Posted Sep 1, 2024 13:20 UTC (Sun) by farnz (subscriber, #17727) [Link] (8 responses)

Unfortunately, I have had to use a gmail.com address at more than one site that's rejected my .uk address for having "too many parts". I've also had rejections of my gmail.com address (easily worked around, thanks to Google's flexibility), since you can't have a dot to the left of the @ sign according to some web developers.

Basically, think of a way to get it wrong - there will be sites that get it wrong. Thankfully, since it's just e-mail addresses, it's relatively easy to get a new "free" address from somewhere like GMail that works around the mistakes, but I've encountered similar stupidities with credit card details, where a web developer has confidently assumed that certain things are not possible, and been absolutely floored to discover that banks have issued "impossible" cards (completely within the card spec, but not what the web developer expected, because of things like a CV2 number that "looked like" an expiry date).

Badly coded websites

Posted Sep 1, 2024 17:53 UTC (Sun) by Heretic_Blacksheep (guest, #169992) [Link] (1 responses)

I think a lot of this has to do with how many companies hire code boot camp 'programmers' who may know how to write generic code in whatever language the code boot camp used, but only have superficial information about how anything else on the computer, networks, and such actually work. All they want is a website that does X, Y, and Z, won't pay more than N USD, and they basically get what they paid for.

That may be fine for personal, or small business pages that are largely just a static advertisement for a person or business, but anytime you need more than that these graphic design houses aren't qualified.

Badly coded websites

Posted Sep 2, 2024 8:42 UTC (Mon) by farnz (subscriber, #17727) [Link]

Combine that with an effort to minimise the number of back end errors you see (certainly the case with the e-mail and credit card ones I saw), where you're trying not to send data to the back end if you "know" it's invalid, and you have something that apparently works until you get real users on it.

The trouble is that when you try to fix "mistakes" by the user (and I have no doubt that some users would type their expiry of 1/23 into the CV2 box as 123), you have to ensure that this can't be a genuine user - asking "are you sure - 123 looks like it might be your card expiry" is OK to reduce back end error rates, but "123 is not a valid CV2" is simply wrong, when it's the number on my card.
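
The difference between "ask for confirmation" and "reject" can be sketched as below. This is an assumption-laden illustration, not any site's real logic: the function name `cv2_feedback`, the three status strings, and the 3-or-4-digit length rule are all made up for the example, and the only warning it raises is when the code matches the expiry date the user typed elsewhere in the form.

```python
from typing import Tuple


def cv2_feedback(cv2: str, expiry_month: int, expiry_year: int) -> Tuple[str, str]:
    """Return (status, message); status is "ok", "confirm" or "error"."""
    # Hard rule (assumed): three or four ASCII digits. Nothing else is an error.
    if not (cv2.isascii() and cv2.isdigit() and len(cv2) in (3, 4)):
        return "error", "The security code must be 3 or 4 digits."
    yy = expiry_year % 100
    # Ways the entered expiry could end up in the wrong box: MYY or MMYY.
    lookalikes = {f"{expiry_month}{yy:02d}", f"{expiry_month:02d}{yy:02d}"}
    if cv2 in lookalikes:
        # Soft check: the user may proceed after confirming.
        return "confirm", f"Are you sure? {cv2} looks like it might be your card expiry."
    return "ok", ""
```

A code of 123 with an expiry of 1/23 yields "confirm", never "error", so a card whose number really is 123 still goes through once the user confirms.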

Badly coded websites

Posted Sep 14, 2024 11:25 UTC (Sat) by sammythesnake (guest, #17693) [Link] (5 responses)

I'm in two minds about this one, having implemented the entire spec twice for different employers:

First, the spec is easily found and clearly defined - it's not rocket surgery to get the whole thing right (including knowing if the card is a visa or amex or whatever, so don't ask the user!)

On the other hand, the spec itself is a bear, with 1001 different historical warts, so if you think you've found a way to simplify it (which inevitably means being wrong, of course) I can absolutely understand the temptation to do so...

Ideally, the banks etc. would all get together and agree to retire all but one checksum algorithm, standardise on one number length etc. so that the spec can be implemented in half a dozen lines of code, but that's herding cats :-(
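
For a sense of scale, here is what a single checksum looks like in code. The comment above doesn't name an algorithm; this sketch assumes the Luhn (mod 10) check, which is the one commonly used on payment card numbers. It checks the checksum only: number length and issuer prefixes are not validated, and the function name `luhn_valid` is made up.

```python
def luhn_valid(pan: str) -> bool:
    """Return True if `pan` passes the Luhn (mod 10) check."""
    # Allow the spaces or hyphens people type between digit groups.
    cleaned = pan.replace(" ", "").replace("-", "")
    if not (cleaned.isascii() and cleaned.isdigit()):
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for position, ch in enumerate(reversed(cleaned)):
        value = int(ch)
        if position % 2 == 1:
            value *= 2
            if value > 9:
                value -= 9  # same as adding the two digits of the product
        total += value
    return total % 10 == 0
```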

Badly coded websites

Posted Sep 14, 2024 14:09 UTC (Sat) by Wol (subscriber, #4433) [Link]

> First, the spec is easily found and clearly defined - it's not rocket surgery to get the whole thing right (including knowing if the card is a visa or amex or whatever, so don't ask the user!)

It's amazing the number of people who it never occurs to to try and get it right. I've just been asked to implement a check on lorry driver hours, and my FIRST reaction was to ask the requester to send me a copy of the appropriate regulations.

I think this will be the third regulation I've implemented a check for - the previous one was "a driver cannot work for more than 13 calendar days without a 45hr break". The guy who implemented the previous version of the check thought all he needed to look at was the previous, current and next calendar week. How can you check a 13-day rule if at the start of the week you can only see back 8 days! And he just couldn't see what was wrong with it.
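
To show why a fixed three-week window can't work, here is a rough sketch that scans the whole shift history instead. It is not taken from any regulation: it assumes the rule means "the calendar days spanned since the last gap of at least 45 hours must not exceed 13", and the names `Shift`, `REQUIRED_BREAK`, `MAX_CALENDAR_DAYS` and `first_violation` are all hypothetical. The real rule text should decide the details.

```python
from datetime import datetime, timedelta
from typing import List, Optional, Tuple

Shift = Tuple[datetime, datetime]  # (start, end) of one working period

REQUIRED_BREAK = timedelta(hours=45)  # assumed reading of "a 45hr break"
MAX_CALENDAR_DAYS = 13


def first_violation(shifts: List[Shift]) -> Optional[datetime]:
    """Return the start of the first shift that breaks the rule, else None."""
    stretch_start = None  # date on which the current unbroken stretch began
    previous_end = None   # latest end time seen so far
    for start, end in sorted(shifts):
        if previous_end is None or start - previous_end >= REQUIRED_BREAK:
            stretch_start = start.date()  # a qualifying break resets the count
        days_spanned = (end.date() - stretch_start).days + 1
        if days_spanned > MAX_CALENDAR_DAYS:
            return start
        previous_end = end if previous_end is None else max(previous_end, end)
    return None
```

The stretch start is carried forward from however far back the last qualifying break was, so the check never depends on how many days of history happen to be "visible" from the current week.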

Cheers,
Wol

Badly coded websites

Posted Sep 14, 2024 16:27 UTC (Sat) by farnz (subscriber, #17727) [Link] (3 responses)

The trouble is that, while I've encountered cases that are the payment card industry's fault (Amex CV2 being 4 digits and in a different place on the card, some cards having a 19 digit PAN), I've also encountered a lot that add complexity (like "CV2 can't be interpreted as a date in the format M/YY"). And while I get the temptation to simplify in ways that you think don't matter, adding complexity because you assume things outside the spec doesn't make a huge amount of sense to me.

Badly coded websites

Posted Oct 2, 2024 8:22 UTC (Wed) by taladar (subscriber, #68407) [Link] (2 responses)

> "CV2 can't be interpreted as a date in the format M/YY"

Wouldn't that break for the last three months of every year? Not to mention that 4 digit years should really be used for everything by now.

Badly coded websites

Posted Oct 2, 2024 10:12 UTC (Wed) by farnz (subscriber, #17727) [Link]

The CV2 is three digits; I had a CV2 (at the time) of 821, and the site rejected this on the grounds that this "must" be the issue date of my card, not the CV2.

I assume that they were looking at the last 2 digits, and deciding if it looked "too close" to a plausible expiry or issue year in order to determine if the CV2 was valid, but the error message told me that it was rejecting it because I'd supplied the issue date, not the CV2. Experimentally, it also rejected 828 as my card expiry date (even though I had put in the actual expiry date elsewhere in the form).

Badly coded websites

Posted Oct 2, 2024 10:34 UTC (Wed) by Wol (subscriber, #4433) [Link]

Actually, I would have thought it would break for every CV2 >= 100. M can't be 0, but YY could be anything.

And why use YYYY for user interaction when all dates are "about now"? Credit card dates are all within a 10-year window, so YY is not ambiguous.

Cheers,
Wol