Regular expressions (RegEx for short) are special strings that define patterns for matching specific sets of strings. RegEx questions are an interview favorite for many developers because they let you quickly test an interviewee’s ability to break a problem into smaller parts without needing to write a lot of code.
There are some excellent online tools available for testing your regular expression syntax and matches. http://regexpal.com/ is an interesting one to play around with. On the desktop, TextMate on Mac and Notepad++ are good alternatives.
In this post, we will review some of the basic regular expressions. In future posts, we will look into constructing more complex patterns.
Question: Develop a regular expression to match a US phone number.
Let us take an example phone number: 425-882-8080 (this also happens to be Microsoft’s main line number, so don’t call it unless you absolutely have to).
- The simplest RegEx pattern for this can be the number itself. Yes, that works too. But I don’t think the interviewer would be very happy if you give her this answer.
- Using character classes or sets, we can match a group of characters with or without specifying all of them. For example, [0-9] tells the processor to match any digit in the range of 0 to 9. The square brackets are not literally matched because they are treated specially as meta-characters. A meta-character has special meaning in regular expressions and is reserved. A regular expression in the form [0-9] is called a character class, or sometimes a character set. In addition, you can be more specific and list the digits you want matched. For example, [02468] matches only one of the digits 0, 2, 4, 6 or 8. As a next-step solution for our problem using character classes, the following RegEx will work (but don’t tell the interviewer that this is your final solution yet): [0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]
- Let’s bump up our skills now by using character shorthand. A \d matches any digit. A \D matches any non-digit. Our answer can now be shortened to \d\d\d-\d\d\d-\d\d\d\d (which matches the hyphen (-) exactly) or, better yet, to \d\d\d\D\d\d\d\D\d\d\d\d, which accepts any non-digit as the separator. Note that instead of using a \D, we could have used a dot (.), which matches any character (by default, any character except a newline).
- Note that wrapping a part of a regular expression in parentheses () creates a group. We will learn more about groups in a future post.
- To shorten our RegEx further, we can enlist the help of quantifiers. As the name suggests, quantifiers allow you to specify how many times the preceding expression should match. There are a number of ways to specify a quantifier: \d{3} means match a digit exactly 3 times; the question mark (?) means zero or one; the plus sign (+) means one or more; and the asterisk (*) means zero or more. Given our new knowledge about quantifiers, our answer can be updated to (\d{3}[-]?){2}\d{4}, which will match two non-parenthesized sequences of three digits, each followed by an optional hyphen, and then exactly four digits.
- We are almost there. It is now time to make our RegEx more robust, professional, smart and production ready. Let’s add the following features:
- The area code can be optional
- Allow literal parentheses to optionally wrap the first sequence of three digits
- The separator character can either be a dot (.) or a hyphen (-)
Our final answer, which should be good enough for an interview, matches a 10-digit US phone number with or without parentheses, hyphens, or dots, and with an optional 3-digit area code: ^(\(\d{3}\)|^\d{3}[.-]?)?\d{3}[.-]?\d{4}$
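To see how the patterns from the steps above behave side by side, here is a minimal sketch in Python. The choice of Python's re module is an assumption made for illustration, since the post does not name a RegEx engine, and the names PATTERNS and matching_patterns are hypothetical helpers, not part of any library.

```python
import re

# The patterns from the steps above, from most literal to most flexible.
PATTERNS = {
    "literal": r"425-882-8080",
    "character classes": r"[0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]",
    "shorthand": r"\d\d\d\D\d\d\d\D\d\d\d\d",
    "quantifiers": r"(\d{3}[-]?){2}\d{4}",
    "final": r"^(\(\d{3}\)|^\d{3}[.-]?)?\d{3}[.-]?\d{4}$",
}


def matching_patterns(text):
    """Return the names of the patterns that match the whole of `text`."""
    return [
        name
        for name, pattern in PATTERNS.items()
        if re.fullmatch(pattern, text)
    ]


if __name__ == "__main__":
    samples = [
        "425-882-8080",    # matched by all five patterns
        "425.882.8080",    # matched by "shorthand" and "final"
        "4258828080",      # matched by "quantifiers" and "final"
        "(425)882-8080",   # matched by "final" only
        "882-8080",        # matched by "final" only (no area code)
        "(425)-882-8080",  # matched by none of them
    ]
    for sample in samples:
        print(f"{sample!r:18} -> {matching_patterns(sample)}")
```

The last sample is worth noticing: in the final pattern, the parenthesized branch of the area-code group has no separator after the closing parenthesis, so (425)-882-8080 is rejected while (425)882-8080 is accepted.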
Let’s dissect this RegEx on a character by character basis to make sure we are on the right track:
- ^ (caret) at the beginning of the regular expression, or following the vertical bar (|), means that the phone number will be at the beginning of a line.
- ( opens a group.
- \( is a literal open parenthesis.
- \d matches a digit.
- {3} is a quantifier that, following \d, matches exactly three digits.
- \) matches a literal close parenthesis.
- | (the vertical bar) indicates alternation, that is, a given choice of alternatives. In other words, this says “match an area code with parentheses or without them.”
- ^ matches the beginning of a line.
- \d matches a digit.
- {3} is a quantifier that matches exactly three digits.
- [.-]? matches an optional dot or hyphen.
- ) closes the capturing group.
- ? makes the group optional; that is, the prefix in the group is not required.
- \d matches a digit.
- {3} matches exactly three digits.
- [.-]? matches another optional dot or hyphen.
- \d matches a digit.
- {4} matches exactly four digits.
- $ matches the end of a line.
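The same breakdown can be written directly into the pattern. The sketch below assumes Python's re module and its re.VERBOSE flag, which ignores whitespace and allows # comments inside a pattern; the names COMPACT, VERBOSE and same_verdict are made up for this example. It is one way to keep a pattern readable, not the only one.

```python
import re

# The final pattern exactly as written in the post.
COMPACT = re.compile(r"^(\(\d{3}\)|^\d{3}[.-]?)?\d{3}[.-]?\d{4}$")

# The same pattern, spread out and annotated piece by piece.
VERBOSE = re.compile(r"""
    ^                   # start of the string
    (                   # open the area-code group
        \( \d{3} \)     #   three digits wrapped in literal parentheses
      |                 #   or
        ^ \d{3} [.-]?   #   three digits and an optional dot or hyphen
    )?                  # close the group and make it optional
    \d{3}               # three digits
    [.-]?               # an optional dot or hyphen
    \d{4}               # four digits
    $                   # end of the string
""", re.VERBOSE)


def same_verdict(text):
    """Check that both spellings of the pattern agree on `text`."""
    return bool(COMPACT.search(text)) == bool(VERBOSE.search(text))


if __name__ == "__main__":
    for sample in ["425-882-8080", "(425)882-8080", "882.8080", "425-882-808"]:
        print(sample, bool(VERBOSE.search(sample)), same_verdict(sample))
```

Because the flag only changes how the pattern is written, not what it matches, the two compiled patterns should accept and reject the same inputs.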
In future posts, we will look at some more advanced regular expressions with examples.
Hi,
This is very helpful. Very nicely explained. I have a question about the last step. It seems like you divided the first 3 digits into two groups, to check with or without parentheses.
Going by your earlier logic, we could check that using a ?, which would indicate 0 or 1 parenthesis, instead of making two separate groups for 425 or (425).
Thanks.
Nads.
(\-(\d)*).* : Though not perfect even it would do..?
yes yes'
Thanks, it's simple and nice
One more wrinkle: US phone numbers cannot have the number 1 in the first or fourth digit. (Don’t believe me? Name an area code that starts with 1.) Handle this case for super-extra bonus points :)
Nice Explanation....
nothing use
Very nice explanation.
Thank you, very helpful.
Thanks for the explanation
Why do you need the second "^"? Isn't the ^ before the parenthesis enough? Also, doesn't ^ also mean negation? If so, how do I distinguish between the two different meanings in a safe way? Thanks. Very good article.
((\d{3}[.-])*){1,2}\d{4}
^1?[.- ]?\(?[0-9]{3}\)[.- ]?[0-9]{3}[.- ]?[0-9]{4}
I think this is pretty comprehensive but not foolproof
How about this? "(([\(]?\d{3}[\)]?).)?(\d{3}).(\d{4})"