JavaScript Regular Expressions in Detail

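A regular expression is a pattern that matches combinations of characters in a string. In JavaScript, regular expressions are widely used to search, replace, and validate strings. The sections below cover how to create them, what they are made of, and which methods use them.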

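I. Creating Regular Expressions

JavaScript offers two ways to create a regular expression:

1. Using a regular expression literal

A regular expression literal is a syntax form that creates a RegExp object. Its format is:

// Syntax: /pattern/flags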
const regex1 = /hello/;
const regex2 = /world/g;

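Where:

The leading and trailing / are delimiters that mark where the regular expression literal starts and ends.
pattern is the body of the regular expression; for example, \d matches any single digit.
flags are optional modifiers, such as g (global match) and i (ignore case).

Examples

/\d/ - matches any single digit
/hello/g - matches every "hello" in the string (global)
/world/i - matches "world" regardless of case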

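2. Using the RegExp constructor

Besides the literal syntax, you can create a regular expression with the RegExp constructor:

// Syntax: new RegExp(pattern[, flags])
const regex3 = new RegExp('hello');
const regex4 = new RegExp('world', 'g');
const regex5 = new RegExp('\\d'); // The backslash must be doubled, because the string literal consumes one level of escaping

With the constructor, the pattern is passed as a string, so the / delimiters are not used.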
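The constructor is most useful when the pattern is only known at runtime. The following is a minimal sketch of that case; escapeRegExp and buildMatcher are hypothetical helper names introduced here for illustration, not built-in functions.

```javascript
// Hypothetical helper: escape characters that have a special meaning in a pattern,
// so that runtime text is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Hypothetical helper: build a global, case-insensitive matcher from runtime text.
function buildMatcher(text) {
  return new RegExp(escapeRegExp(text), 'gi');
}

const matcher = buildMatcher('1+1');
console.log(matcher.source);                 // 1\+1
console.log('1+1=2, 1+1=10'.match(matcher)); // [ '1+1', '1+1' ]
```

Without the escaping step, the + in '1+1' would be read as a quantifier and the pattern would match "11" instead of "1+1".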

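II. Building Blocks of a Regular Expression

1. Character classes

. - matches any single character except line terminators (such as \n and \r)
\d - matches a digit, same as [0-9]
\D - matches a non-digit, same as [^0-9]
\w - matches a letter, digit, or underscore, same as [A-Za-z0-9_]
\W - matches any character that is not a letter, digit, or underscore, same as [^A-Za-z0-9_]
\s - matches a whitespace character (space, tab, newline, and so on)
\S - matches a non-whitespace character
[abc] - matches any one of the characters inside the brackets
[^abc] - matches any one character that is not listed inside the brackets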
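The character classes listed above can be tried out on a short string. This is an illustrative sketch; the sample text is made up for the example.

```javascript
const sample = 'Room 42_b\n';

console.log(sample.match(/\d/g));       // [ '4', '2' ]
console.log(sample.match(/\w+/g));      // [ 'Room', '42_b' ]
console.log(sample.match(/[^a-z\s]/g)); // [ 'R', '4', '2', '_' ]
console.log(/\s/.test(sample));         // true: the space and the newline are whitespace
console.log(/./.test('\n'));            // false: . does not match a line terminator
```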

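2. Quantifiers

A quantifier applies to the element immediately before it, which can be a single character, a character class, or a group.

* - matches the preceding element 0 or more times
+ - matches the preceding element 1 or more times
? - matches the preceding element 0 or 1 time
{n} - matches the preceding element exactly n times
{n,} - matches the preceding element at least n times
{n,m} - matches the preceding element between n and m times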
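A few quick checks make the difference between the quantifiers concrete. This sketch anchors every pattern with ^ and $ so that the whole string has to match; the helper name check is only for this example.

```javascript
// Hypothetical helper: run a pattern against an input and return the boolean result.
const check = (pattern, input) => pattern.test(input);

console.log(check(/^ab*c$/, 'ac'));      // true: b appears zero times
console.log(check(/^ab+c$/, 'ac'));      // false: + needs at least one b
console.log(check(/^colou?r$/, 'color')); // true: the u is optional
console.log(check(/^\d{4}$/, '2024'));   // true: exactly four digits
console.log(check(/^\d{2,}$/, '7'));     // false: fewer than two digits
console.log(check(/^\d{2,3}$/, '1234')); // false: more than three digits
console.log(check(/^(ab){2}$/, 'abab')); // true: the quantifier applies to the whole group
```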

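3. Anchors

^ - matches the start of the string (with the m flag, also the start of each line)
$ - matches the end of the string (with the m flag, also the end of each line)
\b - matches a word boundary
\B - matches a position that is not a word boundary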
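Anchors match positions rather than characters. The sketch below shows how each one behaves on a few made-up inputs.

```javascript
const anchorResults = {
  startsWithCat: /^cat/.test('catalog'),     // true: "cat" is at the start
  endsWithCat: /cat$/.test('catalog'),       // false: the string ends with "log"
  wholeWord: /\bcat\b/.test('a cat sat'),    // true: "cat" stands alone as a word
  partOfWord: /\bcat\b/.test('catalog'),     // false: no boundary after "cat"
  insideWord: /\Bcat/.test('bobcat'),        // true: "cat" begins inside a word
};

console.log(anchorResults);
```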

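4. Grouping and capturing

(abc) - capturing group; matches abc as a unit and records the matched text
(?:abc) - non-capturing group; groups without recording the matched text
| - alternation; matches the expression on its left or the one on its right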
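The difference between capturing and non-capturing groups shows up in the result array. This sketch uses made-up inputs (a date and a file name) to illustrate it.

```javascript
// Three capturing groups: year, month, day.
const datePattern = /(\d{4})-(\d{2})-(\d{2})/;
const dateResult = datePattern.exec('Released on 2024-03-15.');
console.log(dateResult[0]); // 2024-03-15 (the full match)
console.log(dateResult[1]); // 2024
console.log(dateResult[3]); // 15

// (?:img|photo) groups the alternation without taking a slot in the result.
const fileResult = /(?:img|photo)_(\d+)\.(png|jpg)/.exec('photo_12.jpg');
console.log(fileResult[1]);     // 12
console.log(fileResult[2]);     // jpg
console.log(fileResult.length); // 3: the full match plus two captures
```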

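III. Regular Expression Flags

g - global; finds all matches instead of stopping after the first one
i - ignore case
m - multiline; ^ and $ match at the start and end of each line
s - dotAll; lets . match line terminators as well
u - Unicode; treats the pattern and the input as Unicode code points, so characters stored as UTF-16 surrogate pairs are handled as single characters
y - sticky; matches only at the position given by the regex's lastIndex property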
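The effect of each flag is easiest to see by running the same pattern with and without it. The sketch below uses a made-up two-line string and the code point U+1F600 as test data.

```javascript
const text = 'first line\nSecond line';

console.log(text.match(/^\w+/g));       // [ 'first' ]
console.log(text.match(/^\w+/gm));      // [ 'first', 'Second' ]: m makes ^ match after the newline
console.log(/second/i.test(text));      // true: i ignores case
console.log(/line.Second/.test(text));  // false: . stops at the newline
console.log(/line.Second/s.test(text)); // true: s lets . match the newline

// U+1F600 is stored as two UTF-16 code units.
console.log(/^.$/.test('\u{1F600}'));   // false: . sees two separate units
console.log(/^.$/u.test('\u{1F600}'));  // true: u treats it as one code point

const sticky = /foo/y;
sticky.lastIndex = 3;
console.log(sticky.test('barfoo'));     // true: "foo" starts exactly at index 3
console.log(sticky.lastIndex);          // 6
console.log(sticky.test('barfoo'));     // false: nothing to match at index 6
```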

IV. Regular Expression Methods

Methods of the RegExp object

1. test() - tests whether a string matches the regular expression and returns a boolean.

const regex = /hello/;
console.log(regex.test('hello world')); // true

2. exec() - runs a match and returns a result array, or null when there is no match

const regex = /hello/;
console.log(regex.exec('hello world')); // ["hello", index: 0, input: "hello world", groups: undefined]

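With the g flag, exec() updates the regex's lastIndex property after every call, so repeated calls on the same string step through the matches one by one: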

const regex = /hello/g;
console.log(regex.exec('hello hello')); // ["hello", index: 0, ...]
console.log(regex.exec('hello hello')); // ["hello", index: 6, ...]
console.log(regex.exec('hello hello')); // null
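Because lastIndex advances on every call, exec() is commonly used in a loop to collect all matches together with their positions. The following is a small sketch of that pattern with made-up input; it also shows the related pitfall that test() on a global regex shares the same lastIndex state.

```javascript
const digitPattern = /\d+/g;
const input = 'a1 b22 c333';
const found = [];
let m;

// exec() returns null once no further match exists, which ends the loop
// and resets lastIndex to 0.
while ((m = digitPattern.exec(input)) !== null) {
  found.push({ text: m[0], index: m.index, next: digitPattern.lastIndex });
}

console.log(found);
// [ { text: '1', index: 1, next: 2 },
//   { text: '22', index: 4, next: 6 },
//   { text: '333', index: 8, next: 11 } ]
console.log(digitPattern.lastIndex); // 0

// Pitfall: a reused global regex remembers where it stopped.
const shared = /a/g;
console.log(shared.test('a')); // true
console.log(shared.test('a')); // false: the search resumed from lastIndex 1
```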

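String methods that accept regular expressions

1. match() - finds one or more matches

const str = 'hello hello';
console.log(str.match(/hello/)); // ["hello", index: 0, ...] (non-global: first match with details)
console.log(str.match(/hello/g)); // ["hello", "hello"] (global: all matched substrings)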

2. replace() - replaces matched substrings

const str = 'hello world';
console.log(str.replace(/world/, 'JavaScript')); // "hello JavaScript"

// Using a function to compute the replacement value
const str2 = 'hello 123 world 456';
console.log(str2.replace(/\d+/g, match => parseInt(match, 10) * 2)); // "hello 246 world 912"

3. search() - returns the index of the first match, or -1 if there is none

const str = 'hello world';
console.log(str.search(/world/)); // 6

4. split() - splits a string at each match

const str = 'apple,banana,orange';
console.log(str.split(/,/)); // ["apple", "banana", "orange"]
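A regular expression separator becomes more useful than a plain string when the delimiters vary. This sketch, using made-up input, splits on commas or semicolons with optional surrounding whitespace and shows two behaviors of split() worth knowing.

```javascript
const messyList = 'apple, banana;orange  ,  kiwi';
const fruits = messyList.split(/\s*[,;]\s*/);
console.log(fruits); // [ 'apple', 'banana', 'orange', 'kiwi' ]

// A capturing group in the separator keeps the separators in the result.
const dateParts = '2024-03-15'.split(/(-)/);
console.log(dateParts); // [ '2024', '-', '03', '-', '15' ]

// A match at the very end of the string produces a trailing empty string.
const letters = 'a1b2c3'.split(/\d/);
console.log(letters); // [ 'a', 'b', 'c', '' ]
```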

V. Practical Examples

1. Validating an email address

const emailRegex = /^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}$/;
console.log(emailRegex.test('test@example.com')); // true

2. Validating a mobile phone number (China)

const phoneRegex = /^1[3-9]\d{9}$/;
console.log(phoneRegex.test('13812345678')); // true
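Validation patterns are worth checking against inputs that should fail as well as inputs that should pass. The sketch below reuses the two patterns from the examples above with made-up test inputs; it also shows that the email pattern is a simplification that rejects some addresses that may be valid in practice.

```javascript
const emailRegex = /^[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,6}$/;
const phoneRegex = /^1[3-9]\d{9}$/;

console.log(emailRegex.test('test@example.com'));       // true
console.log(emailRegex.test('test@example'));           // false: no top-level domain
console.log(emailRegex.test('a+b@example.com'));        // false: + is not in the allowed set
console.log(emailRegex.test('me@example.photography')); // false: ending longer than 6 letters

console.log(phoneRegex.test('13812345678'));  // true
console.log(phoneRegex.test('12812345678'));  // false: the second digit must be 3-9
console.log(phoneRegex.test('1381234567'));   // false: only 10 digits
console.log(phoneRegex.test(' 13812345678')); // false: ^ rejects the leading space
```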

3. Extracting the domain from a URL

const url = 'https://www.example.com/path?query=value';
const domainRegex = /https?:\/\/(www\.)?([^/]+)/;
const match = url.match(domainRegex);
console.log(match[2]); // "example.com"
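match() returns null when nothing matches, so indexing the result directly throws on unexpected input. The following sketch wraps the same pattern in a function with a null check; extractDomain is a hypothetical name chosen for this example.

```javascript
// Hypothetical helper: return the captured domain, or null when the input is not an http(s) URL.
function extractDomain(url) {
  const match = url.match(/https?:\/\/(www\.)?([^/]+)/);
  return match ? match[2] : null;
}

console.log(extractDomain('https://www.example.com/path?query=value')); // example.com
console.log(extractDomain('http://example.org'));                        // example.org
console.log(extractDomain('not a url'));                                 // null
console.log(extractDomain('https://example.com:8080/x'));                // example.com:8080
```

As the last call shows, [^/]+ captures everything up to the first slash, so a port number is included in the result. When only the host name is needed, the built-in URL class (new URL(url).hostname) may be a more robust choice.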

4. Converting camelCase to snake_case

const camelCase = 'helloWorldExample';
const snakeCase = camelCase.replace(/([A-Z])/g, '_$1').toLowerCase();
console.log(snakeCase); // "hello_world_example"
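The conversion above puts an underscore before every uppercase letter, including one at the very start of the string. As one possible variation, the sketch below adds \B so that a leading capital gets no underscore; toSnakeCase is a hypothetical helper name.

```javascript
// Hypothetical helper: \B requires the capital letter to sit inside a word,
// so a capital at the start of the string is left without an underscore.
function toSnakeCase(name) {
  return name.replace(/\B([A-Z])/g, '_$1').toLowerCase();
}

console.log(toSnakeCase('helloWorldExample')); // hello_world_example
console.log(toSnakeCase('HelloWorld'));        // hello_world
console.log(toSnakeCase('parseHTML'));         // parse_h_t_m_l
```

The last call shows a limitation: runs of capitals such as acronyms are split letter by letter, so inputs like that need a more specific pattern.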

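VI. Notes

Regular expressions get complicated quickly; choose the simplest pattern that meets the specific requirement.
Some special characters must be escaped with \ to be matched literally: ., *, +, ?, ^, $, {, }, [, ], \, |, (, ). In a regular expression literal, / must be escaped as well.
Greedy vs. non-greedy matching: quantifiers are greedy by default (they match as much as possible); adding ? after a quantifier makes it non-greedy.

  const str = '<div>content1</div><div>content2</div>';
  console.log(str.match(/<div>.*<\/div>/)); // greedy: matches the entire string
  console.log(str.match(/<div>.*?<\/div>/)); // non-greedy: matches only the first <div>...</div>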
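The same greedy and non-greedy behavior applies to every quantifier, not only to *. The sketch below prints the matched text directly and combines the non-greedy form with the g flag to pick up each element separately; the input strings are made up for the example.

```javascript
const html = '<div>content1</div><div>content2</div>';

const greedy = html.match(/<div>.*<\/div>/)[0];
const lazy = html.match(/<div>.*?<\/div>/)[0];
console.log(greedy === html); // true: the greedy form spans both elements
console.log(lazy);            // <div>content1</div>

// Non-greedy plus the g flag finds each element on its own.
const each = html.match(/<div>.*?<\/div>/g);
console.log(each); // [ '<div>content1</div>', '<div>content2</div>' ]

// Counted quantifiers follow the same rule.
console.log('aaaa'.match(/a{2,3}/)[0]);  // aaa
console.log('aaaa'.match(/a{2,3}?/)[0]); // aa
```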