Board rikaka
Author Title regular expression
Time 2012-03-22 Thu. AM 02:37:56
Paid software: http://www.regexbuddy.com/regex.html
http://regex.learncodethehardway.org/book/
**PHP 5 Power Programming, chap. 9
------------------------------
http://www.hkcode.com/programming/581
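1. Password check: the following regular expression checks whether a password is strong enough. It requires at least 8 characters, including at least one lowercase letter, one uppercase letter, and one digit:
2. Email address check:
3. URL check: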
------------------------------
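// 1. Password: at least 8 characters with a digit, a lowercase and an uppercase letter.
if (preg_match('/^(?=.{8,})(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).*$/', $password)) { /* strong enough */ }
// 2. Email: [A-z] also matches [ \ ] ^ and the backtick, so A-Za-z is spelled out.
$regexp = '/^[^0-9][A-Za-z0-9_]+([.][A-Za-z0-9_]+)*[@][A-Za-z0-9_]+([.][A-Za-z0-9_]+)*[.][A-Za-z]{2,4}$/';
// 3. URL: the hyphen is placed last in the class so it is unambiguously literal.
if (!preg_match('#^http(s)?://[a-z0-9_.-]+\.[a-z]{2,4}#i', $websiteUrl)) { /* reject the URL */ }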
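A minimal sketch of how the three checks above could be exercised. The helper names (isStrongPassword, isValidEmail, isValidUrl) and the sample inputs are hypothetical and not part of the original notes. The email pattern assumes A-Za-z in place of A-z.

```php
<?php
// Hypothetical helpers wrapping the three patterns from the notes above.

function isStrongPassword(string $password): bool
{
    // Lookaheads: length >= 8, one digit, one lowercase, one uppercase letter.
    return preg_match('/^(?=.{8,})(?=.*\d)(?=.*[a-z])(?=.*[A-Z]).*$/', $password) === 1;
}

function isValidEmail(string $email): bool
{
    // Assumption: A-Za-z replaces A-z, which also covers the six
    // non-letter ASCII characters that sit between "Z" and "a".
    $pattern = '/^[^0-9][A-Za-z0-9_]+([.][A-Za-z0-9_]+)*[@][A-Za-z0-9_]+([.][A-Za-z0-9_]+)*[.][A-Za-z]{2,4}$/';
    return preg_match($pattern, $email) === 1;
}

function isValidUrl(string $url): bool
{
    // Only the scheme and host prefix are checked; the pattern has no end anchor.
    return preg_match('#^http(s)?://[a-z0-9_.-]+\.[a-z]{2,4}#i', $url) === 1;
}

var_dump(isStrongPassword('Passw0rdX'));            // bool(true)
var_dump(isStrongPassword('password'));             // bool(false): no digit, no uppercase
var_dump(isValidEmail('rikaka@example.com'));       // bool(true)
var_dump(isValidEmail('1abc@example.com'));         // bool(false): starts with a digit
var_dump(isValidUrl('https://www.example.com/path')); // bool(true)
var_dump(isValidUrl('ftp://example.com'));          // bool(false)
```

These patterns are rough filters. The {2,4} limit, for example, rejects longer top-level domains, so real validation probably needs more than this.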
------------------------------
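Given a line of text $string, how would you write a regular expression that removes the HTML tags inside $string?
First of all, PHP has a built-in function, strip_tags(), that removes HTML tags, so why write your own regular expression? Fine, let's treat it as an interview question. I would answer like this: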
$stringOfText = "<p>This is a test</p>";
$expression = "/<(.*?)>(.*?)<\/(.*?)>/";
echo preg_replace($expression, "\\2", $stringOfText);
// Some say you can also use /(<[^>]*>)/ ---> (<[^>]*>)(.*)(<\/[^>]*>)
$expression = "/(<[^>]*>)/";
echo preg_replace($expression, "", $stringOfText);
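A small sketch comparing the two patterns above with strip_tags(). The nested sample string is an assumption, chosen to show where the paired-tag pattern may fall short.

```php
<?php
// Assumed sample input with a nested tag (not from the original post).
$stringOfText = '<p>This is <b>a</b> test</p>';

// Paired-tag pattern: the lazy groups stop at the first closing tag,
// so the nested markup is only partly removed.
$pairResult = preg_replace('/<(.*?)>(.*?)<\/(.*?)>/', '\\2', $stringOfText);
echo $pairResult, "\n";  // This is <b>a test</p>

// Single-tag pattern: every <...> run is removed independently.
$tagResult = preg_replace('/<[^>]*>/', '', $stringOfText);
echo $tagResult, "\n";   // This is a test

// Built-in function, for comparison.
$builtin = strip_tags($stringOfText);
echo $builtin, "\n";     // This is a test
```

On the flat input used in the post, both patterns give "This is a test". They only diverge once tags are nested.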
-----------------
http://www.regular-expressions.info/repeat.html
Laziness Instead of Greediness
The quick fix to this problem is to make the plus lazy instead of greedy. Lazy quantifiers are sometimes also called "ungreedy" or "reluctant". You can do that by putting a question mark behind the plus in the regex. You can do the same with the star, the curly braces and the question mark itself. So our example becomes <.+?>. Let's have another look inside the regex engine.
Again, < matches the first < in the string. The next token is the dot, this time repeated by a lazy plus. This tells the regex engine to repeat the dot as few times as possible. The minimum is one. So the engine matches the dot with E. The requirement has been met, and the engine continues with > and M. This fails. Again, the engine will backtrack. But this time, the backtracking will force the lazy plus to expand rather than reduce its reach. So the match of .+ is expanded to EM, and the engine tries again to continue with >. Now, > is matched successfully. The last token in the regex has been matched. The engine reports that <EM> has been successfully matched. That's more like it.
An Alternative to Laziness
In this case, there is a better option than making the plus lazy. We can use a greedy plus and a negated character class: <[^>]+>. The reason why this is better is because of the backtracking. When using the lazy plus, the engine has to backtrack for each character in the HTML tag that it is trying to match. When using the negated character class, no backtracking occurs at all when the string contains valid HTML code. Backtracking slows down the regex engine. You will not notice the difference when doing a single search in a text editor. But you will save plenty of CPU cycles when using such a regex repeatedly in a tight loop in a script that you are writing, or perhaps in a custom syntax coloring scheme for EditPad Pro.
Finally, remember that this tutorial only talks about regex-directed engines. Text-directed engines do not backtrack. They do not get the speed penalty, but they also do not support lazy repetition operators.
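The greedy, lazy, and negated-class variants discussed above can be compared in PHP with a short sketch. The sample string is an assumption. It is chosen to contain the <EM> tag that the excerpt walks through.

```php
<?php
// Assumed sample string containing the <EM> tag from the walkthrough.
$subject = 'This is a <EM>first</EM> test';

// Greedy plus: runs to the last ">" in the string.
preg_match('/<.+>/', $subject, $greedy);

// Lazy plus: expands one character at a time until ">" can match.
preg_match('/<.+?>/', $subject, $lazy);

// Greedy plus on a negated class: cannot run past the first ">".
preg_match('/<[^>]+>/', $subject, $negated);

echo $greedy[0], "\n";   // <EM>first</EM>
echo $lazy[0], "\n";     // <EM>
echo $negated[0], "\n";  // <EM>
```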
-----------------
http://www.regular-expressions.info/engine.html
First Look at How a Regex Engine Works Internally
Knowing how the regex engine works will enable you to craft better regexes more easily. It will help you understand quickly why a particular regex does not do what you initially expected. This will save you lots of guesswork and head scratching when you need to write more complex regexes.
There are two kinds of regular expression engines: text-directed engines, and regex-directed engines. Jeffrey Friedl calls them DFA and NFA engines, respectively. All the regex flavors treated in this tutorial are based on regex-directed engines. This is because certain very useful features, such as lazy quantifiers and backreferences, can only be implemented in regex-directed engines. No surprise that this kind of engine is more popular.
Notable tools that use text-directed engines are awk, egrep, flex, lex, MySQL and Procmail. For awk and egrep, there are a few versions of these tools that use a regex-directed engine.
You can easily find out whether the regex flavor you intend to use has a text-directed or regex-directed engine. If backreferences and/or lazy quantifiers are available, you can be certain the engine is regex-directed. You can do the test by applying the regex "regex|regex not" to the string "regex not". If the resulting match is only "regex", the engine is regex-directed. If the result is "regex not", then it is text-directed. The reason behind this is that the regex-directed engine is "eager".
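The test described above can be tried against PHP's preg functions with a short sketch. The variable names are illustrative only.

```php
<?php
// An eager, regex-directed engine takes the first alternative that matches.
preg_match('/regex|regex not/', 'regex not', $eager);
echo $eager[0], "\n";   // regex

// Putting the longer alternative first changes the result.
preg_match('/regex not|regex/', 'regex not', $longer);
echo $longer[0], "\n";  // regex not
```

Because the order of the alternatives decides the match here, the preg functions behave as a regex-directed engine.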
In this tutorial, after introducing a new regex token, I will explain step by step how the regex engine actually processes that token. This inside look may seem a bit long-winded at certain times. But understanding how the regex engine works will enable you to use its full power and help you avoid common mistakes.
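-----------------
http://www.encntc.edu.tw/document/php4doc/regular/ereg_replace.html
ereg_replace (PHP 3, PHP 4)
ereg_replace --- regular expression replace
Syntax: string ereg_replace (string pattern, string replacement, string string)
Description:
This function scans string for matches to pattern, then replaces the matched text with replacement.
It returns the modified string.
If pattern contains parenthesized substrings, replacement may contain substrings of the form "\\digit", each of which is replaced by the text matched by the digit-th parenthesized substring. \\0 produces the entire matched text. Up to nine substrings may be used. When parentheses are nested, they are counted by their opening parenthesis.
If no match is found in string, string is returned unchanged.
The following snippet prints "This was a test" three times: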
<?php
$string = "This is a test";
echo ereg_replace (" is", " was", $string);
echo ereg_replace ("( )is", "\\1was", $string);
echo ereg_replace ("(( )is)", "\\2was", $string);
?>
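ereg_replace() was deprecated in PHP 5.3 and removed in PHP 7. The sketch below therefore uses preg_replace(), a different function with delimited patterns, to show the same three replacements on a current PHP.

```php
<?php
// Sketch using preg_replace() in place of the removed ereg_replace().
$string = 'This is a test';

// Plain replacement, no groups.
$a = preg_replace('/ is/', ' was', $string);

// \1 refers to the first parenthesized group (the space).
$b = preg_replace('/( )is/', '\\1was', $string);

// Nested groups are numbered by their opening parenthesis,
// so \2 is the inner group (the space).
$c = preg_replace('/(( )is)/', '\\2was', $string);

echo $a, "\n";  // This was a test
echo $b, "\n";  // This was a test
echo $c, "\n";  // This was a test
```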
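One thing to watch out for: if you pass an integer value as the replacement parameter, you may not get the result you expect. This is because ereg_replace() interprets the number as the ordinal value of a character and applies that. For example: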
<?php
/* This will not work as expected. */
$num = 4;
$string = "This string has four words.";
$string = ereg_replace('four', $num, $string);
echo $string; /* Output: 'This string has words.' */
/* This will work. */
$num = '4';
$string = "This string has four words.";
$string = ereg_replace('four', $num, $string);
echo $string; /* Output: 'This string has 4 words.' */
?>
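The pitfall above is described for ereg_replace(), which is not available on PHP 7 and later. As a sketch under that assumption, the replacement below uses preg_replace() with an explicit string cast, so the result does not depend on how the function treats a non-string argument.

```php
<?php
$num = 4;  // an integer, as in the failing example above
$string = 'This string has four words.';

// Cast explicitly so the replacement is always the text "4".
$result = preg_replace('/four/', (string) $num, $string);

echo $result, "\n";  // This string has 4 words.
```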
------------------------------
Simple version >>
http://cckk.tw/wordpress/52/
PHP regular expressions | C.K. Blog
Regular expressions are a way of matching string formats ...
http://atedev.wordpress.com/2007/11/23/正規表示式-regular-expression/
https://developer.mozilla.org/zh_tw/Core_JavaScript_1.5_教學/正規表達式
(This is the JS version, but it covers the basic rules in more detail)
--
※ Edited by: rikaka Time: 2012-03-27 18:03:49