Clean up Attributed String

If you simply want to match any <font …> tag, you can use "<font [^>]++>".

For any <font …> tag containing a ‘color=#hhhhhh’ element: "<font [^>]*color=#[[:hex:]]{6}[^>]*+>".

For any <font …> tag specifically ending with such an element: "<font [^>]*color=#[[:hex:]]{6}>".
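For anyone who wants to try these three patterns outside ICU, here is a minimal sketch in Python. It is a port under stated assumptions, not the code from this thread: Python's re module does not recognise [[:hex:]], so the sketch substitutes [0-9A-Fa-f], and it uses ordinary greedy quantifiers in place of the possessive ++ and *+ (which accept the same tags here and differ only in how they backtrack). The sample tags and all names are made up for illustration.

```python
import re

# Assumption: [0-9A-Fa-f] stands in for ICU's [[:hex:]], which Python's re lacks.
HEX = r"[0-9A-Fa-f]"

# 1. Any <font ...> tag (greedy + instead of the possessive ++).
any_font = re.compile(r"<font [^>]+>")

# 2. Any <font ...> tag containing a color=#hhhhhh element.
font_with_color = re.compile(r"<font [^>]*color=#" + HEX + r"{6}[^>]*>")

# 3. Any <font ...> tag that ends with such an element.
font_ending_with_color = re.compile(r"<font [^>]*color=#" + HEX + r"{6}>")

# Hypothetical sample tags, invented for this sketch.
samples = [
    "<font size=3>",
    "<font face=Arial color=#ff0000 size=3>",
    "<font face=Arial color=#FF0000>",
]


def matching_tags(pattern):
    """Return the sample tags that the pattern matches in full."""
    return [tag for tag in samples if pattern.fullmatch(tag)]


for label, pattern in [
    ("any font tag", any_font),
    ("contains color", font_with_color),
    ("ends with color", font_ending_with_color),
]:
    print(label, "->", matching_tags(pattern))
```

The first pattern accepts all three samples, the second only the two tags carrying a colour, and the third only the tag where the colour is the last element.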

Many thanks!

This makes total sense.
One question, though: where would I find out that I can use something like


That’s really odd! :face_with_raised_eyebrow: I used [[:hex:]] from memory and it worked (and still works) when tested. I thought it was one of the POSIX bracket expressions, which are listed in man grep and on the site and are recognised by ICU regex. But it’s not one of those. The bracket expression for a hexadecimal digit in those places is [[:xdigit:]], which also works here. I must have seen [[:hex:]] somewhere in the past, but I don’t know where now. :confused: Apologies if I’ve given you something which isn’t officially supported. If in doubt, you can use [0-9A-Fa-f] instead.

It’s a legitimate alias for Hex_Digit. See:

In non-POSIX terms, you would use \p{Hex_Digit}.

Thanks, Shane.

So taken in conjunction with what it says in the ICU Regex Guide, it’s an alternate [sic] POSIX-like syntax for a set expression matching an acceptable abbreviation of the Unicode category “Hex_Digit”? What a stroke of luck! :wink:

And the syntax is only POSIX-like. It doesn’t necessarily have to be in a character set as it would be in a shell script. So [[:hex:]] can be just [:hex:] here. We’re certainly spoiled for choice!

"[:hex:]" or "[[:hex:]]" (“hex” can be written in any combination of cases)
"[:hex_digit:]" (simile)
"[:xdigit:]" (simile)
"\\p{hex}" (simile)
"\\p{hex_digit}" (simile)
"\\p{xdigit}" (simile)
"[0-9A-Fa-f]" or "[[0-9][A-F][a-f]]" (case sensitive, but both cases are covered)
"(?i:[0-9a-f])" or "[[0-9](?i:[a-f])]" (case insensitive, but the cases in the given letter range must be the same)
"[\\d(?i:[a-f])]" or "[\\p{nd}(?i:[a-f])]" or "[\\p{number}(?i:[a-f])]" or "[\\p{decimal number}(?i:[a-f])]" or "(?:\\d|(?i:[a-f]))"etc.!

The perfect ingredients for write-only code :hole:

From "[[0-9][A-F][a-f]]" on, certainly. The options before that seem pretty self-explanatory and allow writers to use what they happen to know. Even some of the later stuff may make sense in a broader context.

No question. It’s the reader I was thinking about.

Well, all in all, a really detailed answer :wink: