java regex pattern question
Board: Java (爪哇娇娃)
c*t
1
Given strings like this "........$1....$2...."
I want to get two arrays:
String[] for texts around $1, $2, etc ($[0-9]+, including $0).
and
String[] that just contain $1 $2 etc.
Any very quick ways of doing this? I could do string length
counting etc, but I am wondering if there are simpler ways.
Thanks.
g*y
2
You can use input.split("\\$[0-9]+") to get the contents (the $ has to be
escaped, because a bare $ is the end-of-line anchor in a regex), but there is
no simple way to extract the splitters themselves. Tracking the indices by
hand is probably still the most efficient way in Java.
ArrayList<String> spliter = new ArrayList<String>();
ArrayList<String> content = new ArrayList<String>();
Pattern pattern = Pattern.compile("\\$[0-9]+");
Matcher matcher = pattern.matcher(input);
int index = 0;  // end of the previous match
while (matcher.find()) {
    int start = matcher.start();
    int end = matcher.end();
    if (start > index) {
        content.add(i
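
The post above is cut off in the source, so here is one possible way the index-tracking loop could be finished. This is only a sketch: the class name SplitKeepingDelimiters, the split helper, and the sample input are made up for illustration and are not from the original post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption, not from the thread.
class SplitKeepingDelimiters {
    private static final Pattern VAR = Pattern.compile("\\$[0-9]+");

    // Appends the text between placeholders to 'content' and the
    // placeholders themselves ($0, $1, ...) to 'splitters'.
    static void split(String input, List<String> content, List<String> splitters) {
        Matcher matcher = VAR.matcher(input);
        int index = 0; // end of the previous match
        while (matcher.find()) {
            if (matcher.start() > index) {
                content.add(input.substring(index, matcher.start()));
            }
            splitters.add(matcher.group());
            index = matcher.end();
        }
        // Text after the last placeholder, if any.
        if (index < input.length()) {
            content.add(input.substring(index));
        }
    }

    public static void main(String[] args) {
        List<String> content = new ArrayList<>();
        List<String> splitters = new ArrayList<>();
        split("abc$1def$2ghi", content, splitters);
        System.out.println(content);   // [abc, def, ghi]
        System.out.println(splitters); // [$1, $2]
    }
}
```

Empty stretches (for example between two adjacent placeholders) are skipped here, which mirrors the `start > index` check in the original fragment.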

[Quoting c*****t:]
: Given strings like this "........$1....$2...."
: I want to get two arrays:
: String[] for texts around $1, $2, etc ($[0-9]+, including $0).
: and
: String[] that just contain $1 $2 etc.
: Any very quick ways of doing this? I could do string length
: counting etc, but I am wondering if there are simpler ways.
: Thanks.

c*t
3
Thanks.
BTW, what's the difference between ArrayList and Vector? I've never
ever used ArrayList.

[Quoting g**********y:]
: you can input.split("\\$[0-9]+") to get contents, but there is no simple way
: to extract out splitters. Perhaps count is still the most efficient way in
: Java.
: ArrayList spliter = new ArrayList();
: ArrayList content = new ArrayList();
: Pattern pattern = Pattern.compile("\\$[0-9]+");
: Matcher matcher = pattern.matcher(input);
: int index = 0;
: while (matcher.find()) {
: int start = matcher.start();

g*y
4
Vector is thread-safe (its methods are synchronized); ArrayList is not
synchronized, so it is lighter.
g*g
5
Using ArrayList boosts performance a little, but it is not
thread-safe.
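
To make the trade-off concrete, here is a small sketch of the usual choices. The class and method names (ListChoices, localList, sharedList, legacySharedList) are invented for this example; Collections.synchronizedList is the standard JDK wrapper for when a synchronized list is needed without Vector.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Vector;

// Hypothetical demo class; names are assumptions for illustration only.
class ListChoices {
    // Used by a single thread: an unsynchronized ArrayList is enough.
    static List<String> localList() {
        return new ArrayList<>();
    }

    // Shared between threads: wrap an ArrayList so each call is synchronized.
    static List<String> sharedList() {
        return Collections.synchronizedList(new ArrayList<>());
    }

    // Older alternative: Vector synchronizes its own methods.
    static List<String> legacySharedList() {
        return new Vector<>();
    }

    public static void main(String[] args) {
        List<String> a = localList();
        List<String> b = sharedList();
        List<String> c = legacySharedList();
        a.add("$1");
        b.add("$1");
        c.add("$1");
        // All three implement List, so calling code does not change.
        System.out.println(a.size() + " " + b.size() + " " + c.size());
    }
}
```

Because all three are used through the List interface, switching between them later is a one-line change.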

[Quoting c*****t:]
: Thanks.
: BTW, what's the difference between ArrayList and Vector? I've never
: ever used ArrayList.

c*t
6
Thank you both. That's good to know. I've always thought they
were the same. I guess that I will use ArrayList from now on :)

[Quoting g*****g:]
: Use ArrayList boost the performance a little bit, while not
: thread-safe

m*t
7

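Something along these lines (not tested or even compiled):
ArrayList<String> parts = new ArrayList<String>();
ArrayList<String> vars = new ArrayList<String>();
// Group 1: text before a placeholder; group 2: the placeholder ($ plus digits).
Pattern p = Pattern.compile("([^$]*)(\\$\\d+)");
Matcher m = p.matcher(string);
int lastMatch = 0;  // m.end() is only valid after a successful find()
while (m.find()) {
    parts.add(m.group(1));
    vars.add(m.group(2));
    lastMatch = m.end();
}
// Keep any text after the last placeholder.
if (lastMatch < string.length()) {
    parts.add(string.substring(lastMatch));
}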

[Quoting c*****t:]
: Given strings like this "........$1....$2...."
: I want to get two arrays:
: String[] for texts around $1, $2, etc ($[0-9]+, including $0).
: and
: String[] that just contain $1 $2 etc.
: Any very quick ways of doing this? I could do string length
: counting etc, but I am wondering if there are simpler ways.
: Thanks.

m*t
8

Maybe add an empty-string check on m.group(1) here
if you don't care about empty parts.
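
A self-contained sketch of the two-group approach with that empty-string check added. The class name PartsAndVars, the parse helper, and the sample input are assumptions for illustration; the end of the last successful match is tracked inside the loop, since Matcher.end() throws once find() has returned false.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; names are not from the thread.
class PartsAndVars {
    // Group 1: text before a placeholder; group 2: the placeholder itself.
    private static final Pattern P = Pattern.compile("([^$]*)(\\$\\d+)");

    static void parse(String s, List<String> parts, List<String> vars) {
        Matcher m = P.matcher(s);
        int lastMatch = 0; // end of the last successful match
        while (m.find()) {
            if (!m.group(1).isEmpty()) { // skip empty text parts
                parts.add(m.group(1));
            }
            vars.add(m.group(2));
            lastMatch = m.end();
        }
        // Text after the last placeholder, if any.
        if (lastMatch < s.length()) {
            parts.add(s.substring(lastMatch));
        }
    }

    public static void main(String[] args) {
        List<String> parts = new ArrayList<>();
        List<String> vars = new ArrayList<>();
        parse("abc$1def$2ghi", parts, vars);
        System.out.println(parts); // [abc, def, ghi]
        System.out.println(vars);  // [$1, $2]
    }
}
```

One caveat with this pattern: text that precedes a lone $ not followed by digits is skipped by find() and never reaches parts, so it only suits inputs where every $ starts a placeholder.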

[Quoting m******t:]
:
: Something along these lines (not tested or even compiled):
: ArrayList parts = new ArrayList();
: ArrayList vars = new ArrayList();
: Pattern p = Pattern.compile("([^$]*)(\\$\\d+)");
: Matcher m = p.matcher(string);
: while (m.find()) {
:     parts.add(m.group(1));
:     vars.add(m.group(2));
: }

Z*e
9
How about matching with
Pattern p = Pattern.compile("(.*)(\\$1)(.*)(\\$2)(.*)");
and then extracting groups 1, 3, 5 as the first array and groups 2, 4 as the second?
But what's the use of the second array anyway? It will always be {"$1", "$2"},
unless $1 and $2 are meant as stand-ins for some other strings?
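
For reference, a sketch of how that fixed-shape pattern could be used. The class name FixedShapeMatch, the parse helper, and the choice to return null on a non-match are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; names are not from the thread.
class FixedShapeMatch {
    private static final Pattern P =
            Pattern.compile("(.*)(\\$1)(.*)(\\$2)(.*)");

    // Returns {texts, vars}, or null if the input does not contain
    // $1 followed later by $2.
    static String[][] parse(String s) {
        Matcher m = P.matcher(s);
        if (!m.matches()) {
            return null;
        }
        String[] texts = { m.group(1), m.group(3), m.group(5) };
        String[] vars = { m.group(2), m.group(4) };
        return new String[][] { texts, vars };
    }

    public static void main(String[] args) {
        String[][] result = parse("aaa$1bbb$2ccc");
        System.out.println(Arrays.toString(result[0])); // [aaa, bbb, ccc]
        System.out.println(Arrays.toString(result[1])); // [$1, $2]
    }
}
```

This only handles exactly the shape "text $1 text $2 text". It does not extend to an arbitrary number of $n placeholders, and since . does not match line terminators by default, multi-line input would need Pattern.DOTALL.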

[Quoting c*****t:]
: Given strings like this "........$1....$2...."
: I want to get two arrays:
: String[] for texts around $1, $2, etc ($[0-9]+, including $0).
: and
: String[] that just contain $1 $2 etc.
: Any very quick ways of doing this? I could do string length
: counting etc, but I am wondering if there are simpler ways.
: Thanks.

c*t
10
I basically used gloomyturkey's code. The reason I needed to use it
is that I have to translate $$, $1, etc. to different variables.
The $$ and $1 formats cannot be changed to something more Java-like, so
that is why.
With minor changes to gloomyturkey's code, I can use FreeMarker to
write a convoluted translation routine that takes 4 parameters:
${a?type(b,c,d)}
where type is a custom built-in for FreeMarker, a is an object,
b is $/1/2/3 (converted from $$, $1 etc), c

[Quoting Z****e:]
: how about matching using
: Pattern p = Pattern.compile("(.*)(\\$1)(.*)(\\$2)(.*)");
: then extract group 1, 3, 5 as the first array, 2, 4 as the second array
: but what's the use of second array anyways? it'll be {"$1", "$2"} always,
: unless $1 and $2 are meant for stand-ins of some strings?
