国内3G网络哪家的好 - 未名空间MITBBS历史存档

国内3G网络哪家的好# PDA - 掌中宝

p*e2012-10-28 07:10

1 楼

1. remove duplicated characters，keep original order
"abcdace" -> "abcde"
"aaaab" -> "ab"
prototype:
char * remove_dup(char * s)
2. implement regular expression match.
we only need to support "." and "*" ("." means any character, and "*" means
repeat previous char 0 or any times), and we need to match the whole input
for example,
match("aa","a") -> false
match("aa","aa") -> true
match("aa", "a*") -> true
match("aa", ".*") -> true
match("aab", "c*a*b") ->true
the prototype is:
bool is_match(const char* s, const char* p)

B*e2012-10-28 07:10

2 楼

【以下文字转载自 Sex 讨论区】
发信人: leonor (滟莲), 信区: Sex
标题: 昨晚蛋疼时发的征聊贴，今天来看成效~
发信站: BBS 未名空间站 (Sun May 8 07:25:41 2011, 美东)
昨晚说，
征聊。私信。MSN秒加。
混了这么长时间，咱终于不厚道了一次，发完就去睡觉了。
今天早上用爪机一看，inbox直接暴掉。待会儿来细数一下。
好友确认，两条。
短信，5条。收到MSN地址2个。
俺这次真是坑了，抱歉了某孩子LOLOL
邮件，20封... 收到MSN地址13个...
最槑的一句是， Hi, nice to meet you...
还有来秒征的...
失策之处是，应该要了照片再来的.....
哈哈哈。。。那就太掉人品了。。。

o*d2012-10-28 07:10

3 楼

tablet用1G data得多少钱

k*j2012-10-28 07:10

4 楼

赞分享。
2题都是经典题目。发现fb非常喜欢考第二题。
http://www.mitbbs.com/article_t/JobHunting/31713505.html

means

【在 p****e 的大作中提到】

: 1. remove duplicated characters，keep original order
: "abcdace" -> "abcde"
: "aaaab" -> "ab"
: prototype:
: char * remove_dup(char * s)
: 2. implement regular expression match.
: we only need to support "." and "*" ("." means any character, and "*" means
: repeat previous char 0 or any times), and we need to match the whole input
: for example,
: match("aa","a") -> false

x*s2012-10-28 07:10

5 楼

寂寞的人很多啊

l*g2012-10-28 07:10

6 楼

你懂的，只有两家，电信和联通

H*y2012-10-28 07:10

7 楼

#2
match("aa",".*")
if "." is "a" and "*" is 0 then
match("aa","a") -> true.
contradicted with the first example: match("aa","a") -> false

means

【在 p****e 的大作中提到】

: 1. remove duplicated characters，keep original order
: "abcdace" -> "abcde"
: "aaaab" -> "ab"
: prototype:
: char * remove_dup(char * s)
: 2. implement regular expression match.
: we only need to support "." and "*" ("." means any character, and "*" means
: repeat previous char 0 or any times), and we need to match the whole input
: for example,
: match("aa","a") -> false

C*g2012-10-28 07:10

8 楼

女的“蛋疼”是哪里疼？

o*d2012-10-28 07:10

9 楼

还有个移动3G
不知哪家的便宜好使。
主要是想给在农村的老父亲买个3G tablet

【在 l********g 的大作中提到】

: 你懂的，只有两家，电信和联通

s*b2012-10-28 07:10

10 楼

为啥扯到DP啊？regex的匹配是lexical analysis常见算法。常见算法里，要么就上
Regex -> NFA/DFA加记忆（也就是说DP最多是优化技巧），要么是Tompson's
algorithm, 要么就是recursive decent。谁说Recursion就一定慢来着？The Practice
of Programming里的mutual recursion对付面试足够了。

【在 k*j 的大作中提到】

: 赞分享。
: 2题都是经典题目。发现fb非常喜欢考第二题。
: http://www.mitbbs.com/article_t/JobHunting/31713505.html
:
: means

y*n2012-10-28 07:10

11 楼

排卵痛。

【在 C********g 的大作中提到】

: 女的“蛋疼”是哪里疼？

c*t2012-10-28 07:10

12 楼

问问第二题中的 “*”：
"*" means repeat previous char 0 or any times
可是在下面的链接中：
'*' (zero of more arbitrary char match).
http://www.mitbbs.com/article_t/JobHunting/31713505.html
到底"*"指的是什么？ "*"作用于它前面的字符？
你的例子：
match("aab", "c*a*b") ->true
我的理解是：
第一个*把它前面的给消掉
第二个*把它前面的a重复两次
结果就是aab了。所以match, 对吗？
谢谢！

means

【在 p****e 的大作中提到】

: 1. remove duplicated characters，keep original order
: "abcdace" -> "abcde"
: "aaaab" -> "ab"
: prototype:
: char * remove_dup(char * s)
: 2. implement regular expression match.
: we only need to support "." and "*" ("." means any character, and "*" means
: repeat previous char 0 or any times), and we need to match the whole input
: for example,
: match("aa","a") -> false

r*e2012-10-28 07:10

13 楼

岂止是寂寞

【在 x**********s 的大作中提到】

: 寂寞的人很多啊

y*g2012-10-28 07:10

14 楼

wildcard* 和regex * 不一样

【在 c*********t 的大作中提到】

: 问问第二题中的 “*”：
: "*" means repeat previous char 0 or any times
: 可是在下面的链接中：
: '*' (zero of more arbitrary char match).
: http://www.mitbbs.com/article_t/JobHunting/31713505.html
: 到底"*"指的是什么？ "*"作用于它前面的字符？
: 你的例子：
: match("aab", "c*a*b") ->true
: 我的理解是：
: 第一个*把它前面的给消掉

c*t2012-10-28 07:10

15 楼

好的，
就是想弄明白这里的
"*" means repeat previous char 0 or any times
到底”repeat previous char 0 time“指的是什么？
是不是把前面的那个字符给删除掉？

【在 y*******g 的大作中提到】

: wildcard* 和regex * 不一样

g*i2012-10-28 07:10

16 楼

这链接里是wildcard,楼主的题目是regex,还是不一样.
这题我觉得用递归可能好点,不用递归的话我觉得难点在于一下这些特殊pattern
c*cba (*前后一样),
c*.ba (*后面是"."),
.* (*前面是".")
c*c*.*c* (同一个字符和"."的*连续出现)
谁能给点思路吗?

【在 k*j 的大作中提到】

: 赞分享。
: 2题都是经典题目。发现fb非常喜欢考第二题。
: http://www.mitbbs.com/article_t/JobHunting/31713505.html
:
: means

f*t2012-10-28 07:10

17 楼

我刚写的，至少例子都能通过。话说我没觉得递归哪里不好啊。。。
bool is_match(const char* s, const char* p)
{
// invalid input
if(s == NULL || p == NULL || p[0] == '*')
return false;

// both string and pattern terminate synchronously
if(s[0] == 0 && p[0] == 0)
{
return true;
}

// pattern has been used up but string does not reach the end
if(p[0] == 0 && s[0] != 0)
{
return false;
}

// test if there is a '*' after current character
bool anyOccur = false;
if(p[0] != 0 && p[1] == '*')
{
anyOccur = true;
}

// string has been used up but pattern does not reach the end
if(s[0] == 0 && p[0] != 0)
{
if(anyOccur) // still has possibility of match
{
return is_match(s, p+2);
}
else // must not match
{
return false;
}
}

// start from now, both string should have available characters to
compare
if(p[0] == '.')
{
// '.' match every character
if(!anyOccur)
{
return is_match(s+1, p+1);
}
else // '.' could have any number of occurences
{
return (is_match(s, p+2) || // just ignore it
is_match(s+1, p)); // continue to match with '.'
}
}
else // match other characters
{
if(!anyOccur) // must be exact match!
{
if(p[0] == s[0])
{
return is_match(s+1, p+1);
}
else
{
return false;
}
}
else
{
if(p[0] == s[0])
{
return (is_match(s+1, p+2) || // match next token in
s and p
is_match(s+1, p) || // match current
token with next character in s
is_match(s, p+2)); // match next token in
p
}
else
{
return is_match(s, p+2); // abandon this token
and still get a chance
}
}
}

return false;
}

s*y2012-10-28 07:10

18 楼

Does your code pass this:
"aa" ".**a*" ?

f*t2012-10-28 07:10

19 楼

"*" means repeat previous char 0 or any times
这样.**第二个*把前面的*重复了无数次，没意义啊。。
为了方便我的程序直接判定这个pattern为invalid

【在 s*****y 的大作中提到】

: Does your code pass this:
: "aa" ".**a*" ?

g*i2012-10-28 07:10

20 楼

我觉得可以,顺便问下c里面不是"\0"才是终止吗?你这里=0也可以判断? 好久不用c有点
忘了

【在 f*******t 的大作中提到】

: 我刚写的，至少例子都能通过。话说我没觉得递归哪里不好啊。。。
: bool is_match(const char* s, const char* p)
: {
: // invalid input
: if(s == NULL || p == NULL || p[0] == '*')
: return false;
:
: // both string and pattern terminate synchronously
: if(s[0] == 0 && p[0] == 0)
: {

f*t2012-10-28 07:10

21 楼

'\0'就是0x00
NULL也是0

【在 g*****i 的大作中提到】

: 我觉得可以,顺便问下c里面不是"\0"才是终止吗?你这里=0也可以判断? 好久不用c有点
: 忘了

g*i2012-10-28 07:10

22 楼

谢谢解释

【在 f*******t 的大作中提到】

: '\0'就是0x00
: NULL也是0

p*e2012-10-28 07:10

23 楼

贴个楼主事后写的：
bool _re_match(const char *str, const char *pattern, char prev_char) {
// 如果str匹配完，检查pattern是否匹配完，或者还剩一个"*"
if (*str == NULL)
return (*pattern == NULL || (*pattern == '*' && *(pattern+1) == NULL
)) ? true : false;
if (*pattern != '*')
{
// 如果当前pattern char不是'*', 试图匹配当前的str char, 然后
继续。
if (*pattern == '.' || *pattern == *str)
{
if (_re_match(str+1, pattern+1, *pattern))
return true;
}

// 特殊情况，当前pattern后是一个'*', 考虑丢弃当前的pattern,
再接着匹配
if (*(pattern+1) == '*' && _re_match(str, pattern+2, '*'))
return true;

return false;
}
else if (*pattern == '*')
{
//与前一种类似，用prev_char去匹配当前char,或者丢弃prev_char后接着尝试匹配
if ((prev_char == '.' || prev_char == *str) && _re_match(str+1,
pattern, prev_char))
return true;
if (_re_match(str, pattern+1, '*'))
return true;
return false;
}
_ASSERT(false);
return false;
}
bool re_match(const char *str, const char *pattern)
{
return _re_match(str, pattern, 0);
}
感觉在递归中引入一个prev_char后，逻辑要更简单些，下面是我用的一些test case:
cout << re_match("aaa", "aaa") << endl;
cout << re_match("aaa", "aa") << endl;
cout << re_match("aaa", "aaaa") << endl;
cout << re_match("aaa", "a.a") << endl;
cout << re_match("aaa", ".a") << endl;
cout << re_match("aaa", "a*a") << endl;
cout << re_match("aaa", "ab*a") << endl;
cout << re_match("aaa", "ab*ac*a") << endl;
cout << re_match("aaa", "ab*a*c*a") << endl;
cout << re_match("aaca", "ab*a*c*a") << endl;
cout << re_match("aaba", "ab*a*c*a") << endl;
cout << re_match("aaa", ".*") << endl;
感想：
第一次电面F, 事先了解的情况是，题目算法不会难，但是coding的要求比较高，需
要一次写出漂亮无明显bug的代码才能过。

个人感觉是，regular expression这道题确实算法不难，但是事先完全没接触过这个
题目（因为觉的要用state machine, 面试不可能考）。当时一下子有点懵住的感觉。
然后我问面试官是否要用state machine类的算法，他说当然可以，但是你可以试试
recursive. 后来我就写了个类似上面的recursive的解，然后我自己修改一些明显的错
误，然后他又指出一些问题，然后他貌似比较满意了，这道题就OK了。
电面刚完几分钟，hr就来信了，给了一些feedback然后安排了下一次电面。
今天在电脑上重写了一下这个题，发现我面试时写的还是有不少bug的，比如当match
("aa","a*aa")这样的串的时候，没有考虑完全跳过"a*"的情况。所以说从个人经历来
讲，也不是一定要一次写出无bug的代码，重要的是在与面试官的交流过程中表现出你
的coding skills就可以了

p*e2012-10-28 07:10

24 楼

另外，也写了下wildchar的match, 要比regular expression简洁许多(思路与上面类似
）：
bool wildchar_match(const char *str, const char *pattern)
{
if (*str == NULL)
return (*pattern == NULL || (*pattern == '*' && *(pattern+1) == NULL
)) ? true : false;
if (*pattern != '*')
return (*pattern == '.' || *pattern == *str) ? wildchar_match(str+1,
pattern+1) : false;
else
return wildchar_match(str+1, pattern) ? true : wildchar_match(str,
pattern+1);
}

i*e2012-10-28 07:10

25 楼

我刚找到你代码有个 bug，请试一下这个 case:
isMatch("a", "ab*") -> 这个应该返回的是 true.
我尝试写这个递归程序的时候也犯了跟你同样的错误了。
我的代码如下，感觉跟你思路挺像，但是我的多了个 while loop. （那个 do-while
loop 不用管它，只是跳掉多余的 '*'s).
bool isMatch(const char *s, const char *p) {
assert(s && p);
if (*p == '\0') return *s == '\0';
// next char is not '*': must match current character
if (*(p+1) != '*') {
assert(*p != '*');
return ((*p == *s) || (*p == '.' && *s != '\0')) && isMatch(s+1, p+1);
}
// next char is '*'
char rep = *p; // repeated char
// skips extra '*'s
do {
p++;
} while (*p == '*');
while ((rep == '.' && *s != '\0') || *s == rep) {
if (isMatch(s, p)) return true;
s++;
}
return isMatch(s, p);
}

NULL

【在 p****e 的大作中提到】

: 贴个楼主事后写的：
: bool _re_match(const char *str, const char *pattern, char prev_char) {
: // 如果str匹配完，检查pattern是否匹配完，或者还剩一个"*"
: if (*str == NULL)
: return (*pattern == NULL || (*pattern == '*' && *(pattern+1) == NULL
: )) ? true : false;
: if (*pattern != '*')
: {
: // 如果当前pattern char不是'*', 试图匹配当前的str char, 然后
: 继续。

i*e2012-10-28 07:10

26 楼

还有一个不确定是不是 bug, 还是我理解有问题:
isMatch("ab", ".*") -> 我觉得返回的应该是 false. 你们的代码返回的都是 true.
我的理解是：
.* 这里指的是重复任意字母0或多次。而任意字母在以上例子只能填 'a'，没有其它选
择。

p*e2012-10-28 07:10

27 楼

谢谢指出bug啊，应该在第一个判断有没有匹配完的case中也检查下b*完全跳过的情况
。。。
is_match("ab", ".*")应该返回true, 我当时向面试官确认过。而且现实中的regex中
的".*"是可以匹配任意字符串的。这里应该是认为"*"的优先级比"."要高，所以".*"可
以当成的"",".","..","..." ...中的任意一个。

【在 i**********e 的大作中提到】

: 还有一个不确定是不是 bug, 还是我理解有问题:
: isMatch("ab", ".*") -> 我觉得返回的应该是 false. 你们的代码返回的都是 true.
: 我的理解是：
: .* 这里指的是重复任意字母0或多次。而任意字母在以上例子只能填 'a'，没有其它选
: 择。

i*e2012-10-28 07:10

28 楼

多谢！
明白了，你这样说似乎很有道理。
while loop 加多一个判断条件，搞定！

【在 p****e 的大作中提到】

: 谢谢指出bug啊，应该在第一个判断有没有匹配完的case中也检查下b*完全跳过的情况
: 。。。
: is_match("ab", ".*")应该返回true, 我当时向面试官确认过。而且现实中的regex中
: 的".*"是可以匹配任意字符串的。这里应该是认为"*"的优先级比"."要高，所以".*"可
: 以当成的"",".","..","..." ...中的任意一个。

i*e2012-10-28 07:10

29 楼

我一开始也想多传一个参数进去，但是其实直接看下个字符就可以了。
1）如果下个字符是 '*' 的话，那此字符必须 match pattern 字符。
2）如果下个字符不是 '*' 的话，那就做 brute force matching，一个一个 match 直
到字符与所 matched 字符不一样为止。

NULL

【在 p****e 的大作中提到】

: 贴个楼主事后写的：
: bool _re_match(const char *str, const char *pattern, char prev_char) {
: // 如果str匹配完，检查pattern是否匹配完，或者还剩一个"*"
: if (*str == NULL)
: return (*pattern == NULL || (*pattern == '*' && *(pattern+1) == NULL
: )) ? true : false;
: if (*pattern != '*')
: {
: // 如果当前pattern char不是'*', 试图匹配当前的str char, 然后
: 继续。

p*e2012-10-28 07:10

30 楼

:), 看了你的代码我也发现了，这样写简单很多。。呵呵

【在 i**********e 的大作中提到】

: 我一开始也想多传一个参数进去，但是其实直接看下个字符就可以了。
: 1）如果下个字符是 '*' 的话，那此字符必须 match pattern 字符。
: 2）如果下个字符不是 '*' 的话，那就做 brute force matching，一个一个 match 直
: 到字符与所 matched 字符不一样为止。
:
: NULL

s*x2012-10-28 07:10

31 楼

这个 ihas1337code 大侠曾提到过非递归算法，
http://drdobbs.com/cpp/210200888
昨天晚上测试发现，他对 "aaab" 和 "*aab" 的match 就不对，原因是他总是回溯
patten string for no match case, 对原串回溯不够。
in this case, when it sees "aaa" and "aab" does not match, it will try to
match "ab" and "aab", which is not correct.
for the source string, we should advance one char by one char like you did
when a no match is found.

NULL
1,

【在 p****e 的大作中提到】

: 另外，也写了下wildchar的match, 要比regular expression简洁许多(思路与上面类似
: ）：
: bool wildchar_match(const char *str, const char *pattern)
: {
: if (*str == NULL)
: return (*pattern == NULL || (*pattern == '*' && *(pattern+1) == NULL
: )) ? true : false;
: if (*pattern != '*')
: return (*pattern == '.' || *pattern == *str) ? wildchar_match(str+1,
: pattern+1) : false;

x*n2012-10-28 07:10

32 楼

第二题用栈来解决好像很容易。不过可能跟上面说的递归的算法是一样的。
public class patternMatch {
static java.util.Stack stackS = new java.util.Stack>();
static java.util.Stack stackP = new java.util.Stack>();
static void initStack( char [] s, char [] p ) {
for( char s1 : s ) {
stackS.add(s1);
}
for( char p1 : p ) {
stackP.add(p1);
}
}
static boolean is_match_stack( char [] s, char [] p) {
while( (!stackS.empty()) && (!stackP.empty()) ) {
if( stackP.peek() == '.' ) {
stackP.pop();
stackS.pop();
continue;
}
if( stackP.peek() == stackS.peek() ) {
stackP.pop();
stackS.pop();
continue;
}
if( stackP.peek() == '*' ) {
stackP.pop();
char tmps = stackS.pop();
char tmpp = stackP.pop();
if( tmpp != tmps && tmpp != '.' ) {
if( stackP.pop() != tmps ) {
return false;
}
else {
//'*' is passed
continue;
}
}
//tmpp == tmps
while( !stackP.empty() && stackP.peek() == tmpp ) {
stackP.pop();
}
while( !stackS.empty() && stackS.peek() == tmps ) {
stackS.pop();
}
}
}
if( !stackS.empty() ) return false;
while( !stackP.empty() ) {
if( stackP.pop() == '*' ) {
if( stackP.peek() != '*' ) {
stackP.pop();
}
}
else {
return false;
}
}
return true;
}
public static void main( String [] args ) {
char [] s = { 's','s','s', 'w', 'a', 'b', 'r' };
char [] p = { 'a','*','a','*', 's', '*', '.','a', 'b', 'r','*' };
initStack( s, p );
System.out.println(s);
System.out.println(p);
System.out.println( is_match_stack(s,p) );
}
}

p*e2012-10-28 07:10

33 楼

1. remove duplicated characters，keep original order
"abcdace" -> "abcde"
"aaaab" -> "ab"
prototype:
char * remove_dup(char * s)
2. implement regular expression match.
we only need to support "." and "*" ("." means any character, and "*" means
repeat previous char 0 or any times), and we need to match the whole input
for example,
match("aa","a") -> false
match("aa","aa") -> true
match("aa", "a*") -> true
match("aa", ".*") -> true
match("aab", "c*a*b") ->true
the prototype is:
bool is_match(const char* s, const char* p)

k*j2012-10-28 07:10

34 楼

赞分享。
2题都是经典题目。发现fb非常喜欢考第二题。
http://www.mitbbs.com/article_t/JobHunting/31713505.html

means

【在 p****e 的大作中提到】

: 1. remove duplicated characters，keep original order
: "abcdace" -> "abcde"
: "aaaab" -> "ab"
: prototype:
: char * remove_dup(char * s)
: 2. implement regular expression match.
: we only need to support "." and "*" ("." means any character, and "*" means
: repeat previous char 0 or any times), and we need to match the whole input
: for example,
: match("aa","a") -> false

H*y2012-10-28 07:10

35 楼

#2
match("aa",".*")
if "." is "a" and "*" is 0 then
match("aa","a") -> true.
contradicted with the first example: match("aa","a") -> false

means

【在 p****e 的大作中提到】

: 1. remove duplicated characters，keep original order
: "abcdace" -> "abcde"
: "aaaab" -> "ab"
: prototype:
: char * remove_dup(char * s)
: 2. implement regular expression match.
: we only need to support "." and "*" ("." means any character, and "*" means
: repeat previous char 0 or any times), and we need to match the whole input
: for example,
: match("aa","a") -> false

s*b2012-10-28 07:10

36 楼

为啥扯到DP啊？regex的匹配是lexical analysis常见算法。常见算法里，要么就上
Regex -> NFA/DFA加记忆（也就是说DP最多是优化技巧），要么是Tompson's
algorithm, 要么就是recursive decent。谁说Recursion就一定慢来着？The Practice
of Programming里的mutual recursion对付面试足够了。

【在 k*j 的大作中提到】

: 赞分享。
: 2题都是经典题目。发现fb非常喜欢考第二题。
: http://www.mitbbs.com/article_t/JobHunting/31713505.html
:
: means

c*t2012-10-28 07:10

37 楼

问问第二题中的 “*”：
"*" means repeat previous char 0 or any times
可是在下面的链接中：
'*' (zero of more arbitrary char match).
http://www.mitbbs.com/article_t/JobHunting/31713505.html
到底"*"指的是什么？ "*"作用于它前面的字符？
你的例子：
match("aab", "c*a*b") ->true
我的理解是：
第一个*把它前面的给消掉
第二个*把它前面的a重复两次
结果就是aab了。所以match, 对吗？
谢谢！

means

【在 p****e 的大作中提到】

: 1. remove duplicated characters，keep original order
: "abcdace" -> "abcde"
: "aaaab" -> "ab"
: prototype:
: char * remove_dup(char * s)
: 2. implement regular expression match.
: we only need to support "." and "*" ("." means any character, and "*" means
: repeat previous char 0 or any times), and we need to match the whole input
: for example,
: match("aa","a") -> false

y*g2012-10-28 07:10

38 楼

wildcard* 和regex * 不一样

【在 c*********t 的大作中提到】

: 问问第二题中的 “*”：
: "*" means repeat previous char 0 or any times
: 可是在下面的链接中：
: '*' (zero of more arbitrary char match).
: http://www.mitbbs.com/article_t/JobHunting/31713505.html
: 到底"*"指的是什么？ "*"作用于它前面的字符？
: 你的例子：
: match("aab", "c*a*b") ->true
: 我的理解是：
: 第一个*把它前面的给消掉

c*t2012-10-28 07:10

39 楼

好的，
就是想弄明白这里的
"*" means repeat previous char 0 or any times
到底”repeat previous char 0 time“指的是什么？
是不是把前面的那个字符给删除掉？

【在 y*******g 的大作中提到】

: wildcard* 和regex * 不一样

g*i2012-10-28 07:10

40 楼

这链接里是wildcard,楼主的题目是regex,还是不一样.
这题我觉得用递归可能好点,不用递归的话我觉得难点在于一下这些特殊pattern
c*cba (*前后一样),
c*.ba (*后面是"."),
.* (*前面是".")
c*c*.*c* (同一个字符和"."的*连续出现)
谁能给点思路吗?

【在 k*j 的大作中提到】

: 赞分享。
: 2题都是经典题目。发现fb非常喜欢考第二题。
: http://www.mitbbs.com/article_t/JobHunting/31713505.html
:
: means

f*t2012-10-28 07:10

41 楼

我刚写的，至少例子都能通过。话说我没觉得递归哪里不好啊。。。
bool is_match(const char* s, const char* p)
{
// invalid input
if(s == NULL || p == NULL || p[0] == '*')
return false;

// both string and pattern terminate synchronously
if(s[0] == 0 && p[0] == 0)
{
return true;
}

// pattern has been used up but string does not reach the end
if(p[0] == 0 && s[0] != 0)
{
return false;
}

// test if there is a '*' after current character
bool anyOccur = false;
if(p[0] != 0 && p[1] == '*')
{
anyOccur = true;
}

// string has been used up but pattern does not reach the end
if(s[0] == 0 && p[0] != 0)
{
if(anyOccur) // still has possibility of match
{
return is_match(s, p+2);
}
else // must not match
{
return false;
}
}

// start from now, both string should have available characters to
compare
if(p[0] == '.')
{
// '.' match every character
if(!anyOccur)
{
return is_match(s+1, p+1);
}
else // '.' could have any number of occurences
{
return (is_match(s, p+2) || // just ignore it
is_match(s+1, p)); // continue to match with '.'
}
}
else // match other characters
{
if(!anyOccur) // must be exact match!
{
if(p[0] == s[0])
{
return is_match(s+1, p+1);
}
else
{
return false;
}
}
else
{
if(p[0] == s[0])
{
return (is_match(s+1, p+2) || // match next token in
s and p
is_match(s+1, p) || // match current
token with next character in s
is_match(s, p+2)); // match next token in
p
}
else
{
return is_match(s, p+2); // abandon this token
and still get a chance
}
}
}

return false;
}

s*y2012-10-28 07:10

42 楼

Does your code pass this:
"aa" ".**a*" ?

f*t2012-10-28 07:10

43 楼

"*" means repeat previous char 0 or any times
这样.**第二个*把前面的*重复了无数次，没意义啊。。
为了方便我的程序直接判定这个pattern为invalid

【在 s*****y 的大作中提到】

: Does your code pass this:
: "aa" ".**a*" ?

g*i2012-10-28 07:10

44 楼

我觉得可以,顺便问下c里面不是"\0"才是终止吗?你这里=0也可以判断? 好久不用c有点
忘了

【在 f*******t 的大作中提到】

: 我刚写的，至少例子都能通过。话说我没觉得递归哪里不好啊。。。
: bool is_match(const char* s, const char* p)
: {
: // invalid input
: if(s == NULL || p == NULL || p[0] == '*')
: return false;
:
: // both string and pattern terminate synchronously
: if(s[0] == 0 && p[0] == 0)
: {

f*t2012-10-28 07:10

45 楼

'\0'就是0x00
NULL也是0

【在 g*****i 的大作中提到】

: 我觉得可以,顺便问下c里面不是"\0"才是终止吗?你这里=0也可以判断? 好久不用c有点
: 忘了

g*i2012-10-28 07:10

46 楼

谢谢解释

【在 f*******t 的大作中提到】

: '\0'就是0x00
: NULL也是0

p*e2012-10-28 07:10

47 楼

贴个楼主事后写的：
bool _re_match(const char *str, const char *pattern, char prev_char) {
// 如果str匹配完，检查pattern是否匹配完，或者还剩一个"*"
if (*str == NULL)
return (*pattern == NULL || (*pattern == '*' && *(pattern+1) == NULL
)) ? true : false;
if (*pattern != '*')
{
// 如果当前pattern char不是'*', 试图匹配当前的str char, 然后
继续。
if (*pattern == '.' || *pattern == *str)
{
if (_re_match(str+1, pattern+1, *pattern))
return true;
}

// 特殊情况，当前pattern后是一个'*', 考虑丢弃当前的pattern,
再接着匹配
if (*(pattern+1) == '*' && _re_match(str, pattern+2, '*'))
return true;

return false;
}
else if (*pattern == '*')
{
//与前一种类似，用prev_char去匹配当前char,或者丢弃prev_char后接着尝试匹配
if ((prev_char == '.' || prev_char == *str) && _re_match(str+1,
pattern, prev_char))
return true;
if (_re_match(str, pattern+1, '*'))
return true;
return false;
}
_ASSERT(false);
return false;
}
bool re_match(const char *str, const char *pattern)
{
return _re_match(str, pattern, 0);
}
感觉在递归中引入一个prev_char后，逻辑要更简单些，下面是我用的一些test case:
cout << re_match("aaa", "aaa") << endl;
cout << re_match("aaa", "aa") << endl;
cout << re_match("aaa", "aaaa") << endl;
cout << re_match("aaa", "a.a") << endl;
cout << re_match("aaa", ".a") << endl;
cout << re_match("aaa", "a*a") << endl;
cout << re_match("aaa", "ab*a") << endl;
cout << re_match("aaa", "ab*ac*a") << endl;
cout << re_match("aaa", "ab*a*c*a") << endl;
cout << re_match("aaca", "ab*a*c*a") << endl;
cout << re_match("aaba", "ab*a*c*a") << endl;
cout << re_match("aaa", ".*") << endl;
感想：
第一次电面F, 事先了解的情况是，题目算法不会难，但是coding的要求比较高，需
要一次写出漂亮无明显bug的代码才能过。

个人感觉是，regular expression这道题确实算法不难，但是事先完全没接触过这个
题目（因为觉的要用state machine, 面试不可能考）。当时一下子有点懵住的感觉。
然后我问面试官是否要用state machine类的算法，他说当然可以，但是你可以试试
recursive. 后来我就写了个类似上面的recursive的解，然后我自己修改一些明显的错
误，然后他又指出一些问题，然后他貌似比较满意了，这道题就OK了。
电面刚完几分钟，hr就来信了，给了一些feedback然后安排了下一次电面。
今天在电脑上重写了一下这个题，发现我面试时写的还是有不少bug的，比如当match
("aa","a*aa")这样的串的时候，没有考虑完全跳过"a*"的情况。所以说从个人经历来
讲，也不是一定要一次写出无bug的代码，重要的是在与面试官的交流过程中表现出你
的coding skills就可以了

p*e2012-10-28 07:10

48 楼

另外，也写了下wildchar的match, 要比regular expression简洁许多(思路与上面类似
）：
bool wildchar_match(const char *str, const char *pattern)
{
if (*str == NULL)
return (*pattern == NULL || (*pattern == '*' && *(pattern+1) == NULL
)) ? true : false;
if (*pattern != '*')
return (*pattern == '.' || *pattern == *str) ? wildchar_match(str+1,
pattern+1) : false;
else
return wildchar_match(str+1, pattern) ? true : wildchar_match(str,
pattern+1);
}

i*e2012-10-28 07:10

49 楼

我刚找到你代码有个 bug，请试一下这个 case:
isMatch("a", "ab*") -> 这个应该返回的是 true.
我尝试写这个递归程序的时候也犯了跟你同样的错误了。
我的代码如下，感觉跟你思路挺像，但是我的多了个 while loop. （那个 do-while
loop 不用管它，只是跳掉多余的 '*'s).
bool isMatch(const char *s, const char *p) {
assert(s && p);
if (*p == '\0') return *s == '\0';
// next char is not '*': must match current character
if (*(p+1) != '*') {
assert(*p != '*');
return ((*p == *s) || (*p == '.' && *s != '\0')) && isMatch(s+1, p+1);
}
// next char is '*'
char rep = *p; // repeated char
// skips extra '*'s
do {
p++;
} while (*p == '*');
while ((rep == '.' && *s != '\0') || *s == rep) {
if (isMatch(s, p)) return true;
s++;
}
return isMatch(s, p);
}

NULL

【在 p****e 的大作中提到】

: 贴个楼主事后写的：
: bool _re_match(const char *str, const char *pattern, char prev_char) {
: // 如果str匹配完，检查pattern是否匹配完，或者还剩一个"*"
: if (*str == NULL)
: return (*pattern == NULL || (*pattern == '*' && *(pattern+1) == NULL
: )) ? true : false;
: if (*pattern != '*')
: {
: // 如果当前pattern char不是'*', 试图匹配当前的str char, 然后
: 继续。

i*e2012-10-28 07:10

50 楼

还有一个不确定是不是 bug, 还是我理解有问题:
isMatch("ab", ".*") -> 我觉得返回的应该是 false. 你们的代码返回的都是 true.
我的理解是：
.* 这里指的是重复任意字母0或多次。而任意字母在以上例子只能填 'a'，没有其它选
择。

p*e2012-10-28 07:10

51 楼

谢谢指出bug啊，应该在第一个判断有没有匹配完的case中也检查下b*完全跳过的情况
。。。
is_match("ab", ".*")应该返回true, 我当时向面试官确认过。而且现实中的regex中
的".*"是可以匹配任意字符串的。这里应该是认为"*"的优先级比"."要高，所以".*"可
以当成的"",".","..","..." ...中的任意一个。

【在 i**********e 的大作中提到】

: 还有一个不确定是不是 bug, 还是我理解有问题:
: isMatch("ab", ".*") -> 我觉得返回的应该是 false. 你们的代码返回的都是 true.
: 我的理解是：
: .* 这里指的是重复任意字母0或多次。而任意字母在以上例子只能填 'a'，没有其它选
: 择。

i*e2012-10-28 07:10

52 楼

多谢！
明白了，你这样说似乎很有道理。
while loop 加多一个判断条件，搞定！

【在 p****e 的大作中提到】

: 谢谢指出bug啊，应该在第一个判断有没有匹配完的case中也检查下b*完全跳过的情况
: 。。。
: is_match("ab", ".*")应该返回true, 我当时向面试官确认过。而且现实中的regex中
: 的".*"是可以匹配任意字符串的。这里应该是认为"*"的优先级比"."要高，所以".*"可
: 以当成的"",".","..","..." ...中的任意一个。

i*e2012-10-28 07:10

53 楼

我一开始也想多传一个参数进去，但是其实直接看下个字符就可以了。
1）如果下个字符是 '*' 的话，那此字符必须 match pattern 字符。
2）如果下个字符不是 '*' 的话，那就做 brute force matching，一个一个 match 直
到字符与所 matched 字符不一样为止。

NULL

【在 p****e 的大作中提到】

: 贴个楼主事后写的：
: bool _re_match(const char *str, const char *pattern, char prev_char) {
: // 如果str匹配完，检查pattern是否匹配完，或者还剩一个"*"
: if (*str == NULL)
: return (*pattern == NULL || (*pattern == '*' && *(pattern+1) == NULL
: )) ? true : false;
: if (*pattern != '*')
: {
: // 如果当前pattern char不是'*', 试图匹配当前的str char, 然后
: 继续。

p*e2012-10-28 07:10

54 楼

:), 看了你的代码我也发现了，这样写简单很多。。呵呵

【在 i**********e 的大作中提到】

: 我一开始也想多传一个参数进去，但是其实直接看下个字符就可以了。
: 1）如果下个字符是 '*' 的话，那此字符必须 match pattern 字符。
: 2）如果下个字符不是 '*' 的话，那就做 brute force matching，一个一个 match 直
: 到字符与所 matched 字符不一样为止。
:
: NULL

s*x2012-10-28 07:10

55 楼

这个 ihas1337code 大侠曾提到过非递归算法，
http://drdobbs.com/cpp/210200888
昨天晚上测试发现，他对 "aaab" 和 "*aab" 的match 就不对，原因是他总是回溯
patten string for no match case, 对原串回溯不够。
in this case, when it sees "aaa" and "aab" does not match, it will try to
match "ab" and "aab", which is not correct.
for the source string, we should advance one char by one char like you did
when a no match is found.

NULL
1,

【在 p****e 的大作中提到】

: 另外，也写了下wildchar的match, 要比regular expression简洁许多(思路与上面类似
: ）：
: bool wildchar_match(const char *str, const char *pattern)
: {
: if (*str == NULL)
: return (*pattern == NULL || (*pattern == '*' && *(pattern+1) == NULL
: )) ? true : false;
: if (*pattern != '*')
: return (*pattern == '.' || *pattern == *str) ? wildchar_match(str+1,
: pattern+1) : false;

x*n2012-10-28 07:10

56 楼

第二题用栈来解决好像很容易。不过可能跟上面说的递归的算法是一样的。
public class patternMatch {
static java.util.Stack stackS = new java.util.Stack>();
static java.util.Stack stackP = new java.util.Stack>();
static void initStack( char [] s, char [] p ) {
for( char s1 : s ) {
stackS.add(s1);
}
for( char p1 : p ) {
stackP.add(p1);
}
}
static boolean is_match_stack( char [] s, char [] p) {
while( (!stackS.empty()) && (!stackP.empty()) ) {
if( stackP.peek() == '.' ) {
stackP.pop();
stackS.pop();
continue;
}
if( stackP.peek() == stackS.peek() ) {
stackP.pop();
stackS.pop();
continue;
}
if( stackP.peek() == '*' ) {
stackP.pop();
char tmps = stackS.pop();
char tmpp = stackP.pop();
if( tmpp != tmps && tmpp != '.' ) {
if( stackP.pop() != tmps ) {
return false;
}
else {
//'*' is passed
continue;
}
}
//tmpp == tmps
while( !stackP.empty() && stackP.peek() == tmpp ) {
stackP.pop();
}
while( !stackS.empty() && stackS.peek() == tmps ) {
stackS.pop();
}
}
}
if( !stackS.empty() ) return false;
while( !stackP.empty() ) {
if( stackP.pop() == '*' ) {
if( stackP.peek() != '*' ) {
stackP.pop();
}
}
else {
return false;
}
}
return true;
}
public static void main( String [] args ) {
char [] s = { 's','s','s', 'w', 'a', 'b', 'r' };
char [] p = { 'a','*','a','*', 's', '*', '.','a', 'b', 'r','*' };
initStack( s, p );
System.out.println(s);
System.out.println(p);
System.out.println( is_match_stack(s,p) );
}
}

H*e2012-10-28 07:10

57 楼

wildcard是怎么match的？
* match any sequence including empty string
? match exactly one char?
不象regular expression的*, 要跟前面的char有关系，是这样的吗？

NULL
1,

【在 p****e 的大作中提到】

: 另外，也写了下wildchar的match, 要比regular expression简洁许多(思路与上面类似
: ）：
: bool wildchar_match(const char *str, const char *pattern)
: {
: if (*str == NULL)
: return (*pattern == NULL || (*pattern == '*' && *(pattern+1) == NULL
: )) ? true : false;
: if (*pattern != '*')
: return (*pattern == '.' || *pattern == *str) ? wildchar_match(str+1,
: pattern+1) : false;

w*02012-10-28 07:10

58 楼

测试"baacd", "b.*d"不对，在‘*’的情况下，只能匹配一次前面的字符，匹配了a，
就不能再用来匹配c。匹配后固定一下prev_char应该可以解决问题。

NULL

【在 p****e 的大作中提到】

: 贴个楼主事后写的：
: bool _re_match(const char *str, const char *pattern, char prev_char) {
: // 如果str匹配完，检查pattern是否匹配完，或者还剩一个"*"
: if (*str == NULL)
: return (*pattern == NULL || (*pattern == '*' && *(pattern+1) == NULL
: )) ? true : false;
: if (*pattern != '*')
: {
: // 如果当前pattern char不是'*', 试图匹配当前的str char, 然后
: 继续。

h*i2012-10-28 07:10

59 楼

贴个新手写的
// compare two characters
bool char_equal(const char *a, const char *b)
{
if (*a!='\0')
return (*b=='.' || *a==*b);
else
return *b=='\0';
}
// the required function
bool isMatch(const char *s, const char *p)
{
if (*s=='\0' && *p=='\0')
return true;
if (*p=='\0')
return *s=='\0';
if (*(p+1)=='\0')
{
if (*s=='\0')
return false;
else
return (char_equal(s,p) && *(s+1)=='\0');
}
if (*(p+1)!='*')
{
if (*s=='\0')
return false;
else
return char_equal(s,p) && isMatch(s+1,p+1);
}
else
return isMatch(s,p+2) || (!char_equal(s,p) ? false : isMatch
(s+1,p));
}

h*i2012-10-28 07:10

60 楼

Does your program pass re_Match("",".*")?

NULL

【在 p****e 的大作中提到】

: 贴个楼主事后写的：
: bool _re_match(const char *str, const char *pattern, char prev_char) {
: // 如果str匹配完，检查pattern是否匹配完，或者还剩一个"*"
: if (*str == NULL)
: return (*pattern == NULL || (*pattern == '*' && *(pattern+1) == NULL
: )) ? true : false;
: if (*pattern != '*')
: {
: // 如果当前pattern char不是'*', 试图匹配当前的str char, 然后
: 继续。

h*i2012-10-28 07:10

61 楼

简化后的code，但还是慢
bool isMatch(const char *s, const char *p)
{
if (*p=='\0')
return *s=='\0';
if (*(p+1)!='*')
return char_equal(s,p)? isMatch(s+1,p+1):false;
else
return isMatch(s,p+2) || (!char_equal(s,p) ? false : isMatch
(s+1,p));
}

【在 h**i 的大作中提到】

: 贴个新手写的
: // compare two characters
: bool char_equal(const char *a, const char *b)
: {
: if (*a!='\0')
: return (*b=='.' || *a==*b);
: else
: return *b=='\0';
: }
: // the required function