Regular Expressions
abrwalk, 2013-02-10 04:50:15

Regular expression for domain validation with punycode support?

I'm trying to write a regexp for strict domain validation with support for IDN encoding (xn--).
At the moment, only the TLD check works as it should:

[a-z]{2,6}$|(xn--)?[a-z0-9]{4,32}
i.e. it first checks for a plain a-z TLD, then for a punycode one (for simplicity I have left out the list of all registered zones for now).
But the part to the left of the TLD is where I'm stuck.
Question: how do I check a pattern in which certain characters (hyphen, dot) may appear only inside a name, never at its start or end and never twice in a row,
while still allowing xn-- at the very beginning or right after a dot, and keeping each dot-separated block within the allowed length?
For example, these should be false:
-aaaa
aaaa-
aa--a
--aaa
aaa--
.aaaa
aaaa.
..aaa
aa..a
These must be true:
xn--aaaaaa
xn--a-aa.a
aa.xn--aa
a.xn--aa.a
xn--a-aa-a
I have searched everywhere. Each piece is clear from the manuals, but when I try to assemble them into one pattern, nothing works.
The ready-made examples are mostly too simplistic and don't suit me. I need to determine validity as accurately as possible, to avoid DNS queries for domains that pass the regexp but then turn out to be invalid.
So far I have settled on a plain check of which characters occur, without accounting for repetitions or the correct length of each block.
Here is my script with a list of test domains (75 "should be false" and 10 "should be true"):
pastebin.com/7GaMDZhQ
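
For reference, the rules listed above can be sketched as one pattern built from a per-label sub-pattern. This is only a minimal sketch in Python, not the script from the link: the names `LABEL`, `DOMAIN_RE` and `looks_like_domain` are made up for illustration, input is assumed to be lowercase already, and the limits used (63 characters per label, 253 overall) are the usual DNS ones. It checks syntax only; it does not verify that the part after xn-- is valid punycode or that the TLD exists.

```python
import re

# One label: an optional "xn--" prefix, then alphanumeric runs joined by
# single hyphens. The leading lookahead caps the label at 63 characters.
LABEL = r"(?=[a-z0-9-]{1,63}(?![a-z0-9-]))(?:xn--)?[a-z0-9]+(?:-[a-z0-9]+)*"

# Whole name: labels joined by single dots, at most 253 characters in total.
DOMAIN_RE = re.compile(rf"(?=.{{1,253}}\Z){LABEL}(?:\.{LABEL})*")


def looks_like_domain(name: str) -> bool:
    """Return True if `name` passes the syntactic rules sketched above."""
    return DOMAIN_RE.fullmatch(name) is not None


# The sample lists from the question.
SHOULD_FAIL = ["-aaaa", "aaaa-", "aa--a", "--aaa", "aaa--",
               ".aaaa", "aaaa.", "..aaa", "aa..a"]
SHOULD_PASS = ["xn--aaaaaa", "xn--a-aa.a", "aa.xn--aa",
               "a.xn--aa.a", "xn--a-aa-a"]

for sample in SHOULD_FAIL:
    assert not looks_like_domain(sample), sample
for sample in SHOULD_PASS:
    assert looks_like_domain(sample), sample
```

Because a hyphen is only accepted when it sits between two alphanumeric runs, doubled, leading and trailing hyphens are rejected without any extra lookarounds; the same idea applied to dots handles the empty-label cases.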

2 answer(s)
Nikita Gusakov, 2013-02-10
@abrwalk

Why not convert it to UTF-8 and check that?
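
One way to read this suggestion is to decode the xn-- labels back to Unicode and treat a decoding failure as "invalid". A minimal sketch in Python, assuming the standard library's `idna` codec is acceptable (it implements the older IDNA 2003 rules); `decode_idn` is a hypothetical helper name, not something from the question.

```python
from typing import Optional


def decode_idn(name: str) -> Optional[str]:
    """Return the Unicode form of an ASCII/punycode domain name,
    or None if the name is not ASCII or does not decode."""
    try:
        # encode("ascii") rejects anything that is not plain ASCII;
        # decode("idna") converts each xn-- label back to Unicode.
        return name.encode("ascii").decode("idna")
    except UnicodeError:
        return None


print(decode_idn("xn--p1ai"))     # the .рф TLD in its Unicode form
print(decode_idn("example.com"))  # plain ASCII names come back unchanged
```

Decoding is stricter than a character-level regexp: a label that merely starts with xn-- but is not well-formed punycode may be rejected here even though it looks fine syntactically.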

lehha, 2013-02-12
@lehha

/^[0-9a-z][0-9a-z-]{0,61}[0-9a-z]\.xn--p1ai$/
Here is one for the .рф zone (xn--p1ai). BUT! This only checks the punycode form. According to the rules of the .рф zone, the Cyrillic representation must be checked as well.
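
A rough sketch of how the two checks could be combined in Python: a syntactic check of the punycode form, followed by decoding and a check of the Cyrillic form. The allowed character set used here (Russian letters, digits, hyphen) is an assumption and should be compared against the registry's actual rules; `check_rf`, `PUNY_RF` and `CYRILLIC_LABEL` are hypothetical names.

```python
import re

# Punycode form: one label of 2 to 63 characters (1 + up to 61 + 1), then ".xn--p1ai".
PUNY_RF = re.compile(r"[0-9a-z][0-9a-z-]{0,61}[0-9a-z]\.xn--p1ai")

# Cyrillic form (assumed rule): Russian letters, digits and inner hyphens only.
CYRILLIC_LABEL = re.compile(r"[0-9а-яё](?:[0-9а-яё-]*[0-9а-яё])?")


def check_rf(name: str) -> bool:
    """Check a second-level .рф domain given in its punycode form."""
    if PUNY_RF.fullmatch(name) is None:
        return False
    try:
        # The stdlib idna codec (IDNA 2003) turns xn-- labels back into Unicode.
        decoded = name.encode("ascii").decode("idna")
    except UnicodeError:
        return False
    label, _, tld = decoded.partition(".")
    return tld == "рф" and CYRILLIC_LABEL.fullmatch(label) is not None


# Build a punycode name from a Cyrillic one, then validate it.
puny = "пример.рф".encode("idna").decode("ascii")
print(puny, check_rf(puny))
```

A Latin-only name such as example.xn--p1ai passes the first regexp but fails the Cyrillic check, which is exactly the case the punycode-only pattern cannot catch.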
