Study Notes on the Java Matcher Source Code
Posted by zhuqianli 8 months ago

Pattern p = Pattern.compile("cat");
Matcher m = p.matcher("one cat two cats in the yard");
StringBuffer sb = new StringBuffer();
while (m.find()) {
   m.appendReplacement(sb, "dog");
}
m.appendTail(sb);
System.out.println(sb.toString());
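
The same find/appendReplacement/appendTail loop also accepts group references and escapes in the replacement string, which is what the source below parses. Here is a minimal sketch of that syntax; the helper name `replaceEach`, the class name, and the sample patterns are made up for illustration and do not come from the JDK or from the original post.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReplacementSyntaxDemo {
    // Hypothetical helper: runs the find/appendReplacement/appendTail loop shown above.
    static String replaceEach(String regex, String input, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            m.appendReplacement(sb, replacement);
        }
        m.appendTail(sb);
        return sb.toString();
    }

    public static void main(String[] args) {
        String input = "one cat two cats";

        // $1 refers to numbered group 1, here the letter "c".
        System.out.println(replaceEach("(c)at", input, "$1ow"));

        // ${animal} refers to the named group "animal".
        System.out.println(replaceEach("(?<animal>cat)", input, "[${animal}]"));

        // Matcher.quoteReplacement escapes '$' and '\' so the text is inserted literally.
        System.out.println(replaceEach("cat", input, Matcher.quoteReplacement("$5")));
    }
}
```

`Matcher.quoteReplacement` is the safe choice whenever the replacement text comes from outside the program, because an unescaped `$` or `\` would otherwise be interpreted by the parsing code walked through below.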

 

public Matcher appendReplacement(StringBuffer sb, String replacement) {

    // If no match, return error
    if (first < 0) // first is the index in the input string (one cat two cats in the yard) where the current match starts
        throw new IllegalStateException("No match available");

    // Process substitution string to replace group references with groups
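    int cursor = 0; // index of the current character within the replacement string (dog)
    StringBuilder result = new StringBuilder(); // the replacement string after parsing
    // start parsing
    while (cursor < replacement.length()) {
        char nextChar = replacement.charAt(cursor); // the character currently being processed
        if (nextChar == '\\') { // the current character is the escape character (backslash)
            // the following handles an escape sequence in replacement
            cursor++; // skip the escape character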
            if (cursor == replacement.length())
                throw new IllegalArgumentException(
                    "character to be escaped is missing");
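            nextChar = replacement.charAt(cursor); // the character after the escape character
            result.append(nextChar); // append it to result literally
            cursor++; // advance, then return to the top of the while loop
        } else if (nextChar == '$') { // the current character is $
            // the following resolves references to capturing groups in replacement
            // Skip past $
            cursor++; // skip the $ sign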
            // Throw IAE if this "$" is the last character in replacement
            if (cursor == replacement.length())
               throw new IllegalArgumentException(
                    "Illegal group reference: group index is missing");
            nextChar = replacement.charAt(cursor); // the character after $
            int refNum = -1;
            if (nextChar == '{') { // parse a named capturing group reference
                cursor++;
                StringBuilder gsb = new StringBuilder(); // collects groupname from ${groupname}
                while (cursor < replacement.length()) {
                    nextChar = replacement.charAt(cursor); // current character inside the braces
                    if (ASCII.isLower(nextChar) ||
                        ASCII.isUpper(nextChar) ||
                        ASCII.isDigit(nextChar)) { // the character is an ASCII letter or digit
                        gsb.append(nextChar);
                        cursor++;
                    } else { // otherwise leave the loop; the code below throws unless this character is '}'
                        break;
                    }
                }
                if (gsb.length() == 0)
                    throw new IllegalArgumentException(
                        "named capturing group has 0 length name");
                if (nextChar != '}')
                    throw new IllegalArgumentException(
                        "named capturing group is missing trailing '}'");
                String gname = gsb.toString(); // group name
                if (ASCII.isDigit(gname.charAt(0))) // a group name must not start with a digit
                    throw new IllegalArgumentException(
                        "capturing group name {" + gname +
                        "} starts with digit character");
                if (!parentPattern.namedGroups().containsKey(gname)) // the pattern's named-group map must contain this name
                    throw new IllegalArgumentException(
                        "No group with name {" + gname + "}");
                refNum = parentPattern.namedGroups().get(gname); // look up the group's index (group numbering starts at 1)
                cursor++;
            } else {
                // The first number is always a group
                refNum = (int)nextChar - '0'; // subtract the character code of '0'
                if ((refNum < 0)||(refNum > 9)) // not a digit: throw
                    throw new IllegalArgumentException(
                        "Illegal group reference");
                cursor++;
                // Capture the largest legal group string
                boolean done = false;
                while (!done) {
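                    if (cursor >= replacement.length()) { // no characters left in replacement
                        break;
                    }
                    int nextDigit = replacement.charAt(cursor) - '0'; // subtract the character code of '0'
                    if ((nextDigit < 0)||(nextDigit > 9)) { // not a number: the digit run has ended
                        break;
                    }
                    int newRefNum = (refNum * 10) + nextDigit; // still a digit: multiply by 10 and add it
                    if (groupCount() < newRefNum) { // the extended number exceeds the number of capturing groups
                        done = true; // stop and keep the current refNum
                    } else { // the extended number is a valid group index: accept it and move to the next character
                        refNum = newRefNum;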
                        cursor++;
                    }
                }
            }
            // Append group
            if (start(refNum) != -1 && end(refNum) != -1)
                result.append(text, start(refNum), end(refNum)); // append the input text between the group's start and end indices
        } else { // an ordinary character is appended as is
            result.append(nextChar);
            cursor++;
        }
    }
    // Append the intervening text
    sb.append(text, lastAppendPosition, first); // text from the previous append position up to the start of the current match
    // Append the match substitution
    sb.append(result); // append the parsed replacement string

    lastAppendPosition = last;
    return this;
}
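
The digit loop in the method above ("Capture the largest legal group string") means that `$12` is read as group 12 only when the pattern has at least twelve groups; otherwise it is group 1 followed by a literal `2`. The sketch below tries this out, together with two of the failure paths. The class and helper names are hypothetical, and the exception types are what the listing above implies, so treat them as assumptions when working with a different JDK version.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDigitParsingDemo {
    // Hypothetical helper: replaces only the first match, using appendReplacement directly.
    static String replaceFirstMatch(String regex, String input, String replacement) {
        Matcher m = Pattern.compile(regex).matcher(input);
        StringBuffer sb = new StringBuffer();
        if (m.find()) { // appendReplacement throws IllegalStateException without a match
            m.appendReplacement(sb, replacement);
        }
        m.appendTail(sb);
        return sb.toString();
    }

    // Hypothetical helper: reports which exception type, if any, a replacement string triggers.
    static String failureOf(String regex, String input, String replacement) {
        try {
            replaceFirstMatch(regex, input, replacement);
            return "none";
        } catch (RuntimeException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        // Only one group exists, so "$12" is group 1 followed by a literal '2'.
        System.out.println(replaceFirstMatch("(a)b", "xaby", "<$12>"));

        // Twelve groups exist, so "$12" is group 12.
        String twelve = "(a)(b)(c)(d)(e)(f)(g)(h)(i)(j)(k)(l)";
        System.out.println(replaceFirstMatch(twelve, "abcdefghijkl", "$12"));

        // An escaped digit ends the group number: group 1, then a literal '2'.
        System.out.println(replaceFirstMatch(twelve, "abcdefghijkl", "$1\\2"));

        // A trailing '$' has no group index after it.
        System.out.println(failureOf("cat", "a cat", "dog$"));

        // The first digit is always taken as a group, even if that group does not exist.
        System.out.println(failureOf("(a)b", "xaby", "$3"));
    }
}
```

The first digit after `$` is accepted unconditionally, which is why `$3` against a one-group pattern fails inside `start(refNum)` rather than falling back to a literal; only the second and later digits are checked against `groupCount()`.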