Regex to match all newline characters outside of certain tags

I need to match all newline characters outside of a particular HTML tag or pseudo-tag.

Here is an example. I want to match all "\n"s outside of [code] [/code] tags (in order to replace them with <br> tags) in this text fragment:

These concepts are represented by simple Python classes.  
Edit the polls/models.py file so it looks like this: 

[code]  
from django.db import models

class Question(models.Model):
    question_text = models.CharField(max_length=200)
    pub_date = models.DateTimeField('date published') 
[/code]

I know that I should use negative lookaheads, but I'm struggling to figure the whole thing out.

Specifically, I need a PCRE expression; I will use it with PHP and perhaps Python.

dongmu1951 2014-05-28 09:51
To me, this situation seems to be straight out of the question "Match (or replace) a pattern except in situations s1, s2, s3 etc." Please see that question for a full discussion of the solution.

I will give you answers for both PHP and Python (since the example mentioned django).

PHP

(?s)\[code\].*?\[/code\](*SKIP)(*F)|\n

The left side of the alternation matches complete [code]...[/code] tags, then deliberately fails, and skips the part of the string that was just matched. The right side matches newlines, and we know they are the right newlines because they were not matched by the expression on the left.

This PHP program shows how to use the regex:

<?php
// Left branch: match a whole [code]...[/code] block, then skip it and fail.
// Right branch: match a newline that the left branch did not consume.
$regex = '~(?s)\[code\].*?\[/code\](*SKIP)(*F)|\n~';
$subject = "These concepts are represented by simple Python classes.
Edit the polls/models.py file so it looks like this:

[code]
from django.db import models

class Question(models.Model):
    question_text = models.CharField(max_length=200)
    pub_date = models.DateTimeField('date published')
[/code]";
$replaced = preg_replace($regex, "<br />", $subject);
echo $replaced . "<br />\n";
?>

Python

For Python, here's our simple regex:

(?s)\[code\].*?\[/code\]|(\n)

The left side of the alternation matches complete [code]...[/code] tags. We will ignore these matches. The right side matches and captures newlines to Group 1, and we know they are the right newlines because they were not matched by the expression on the left.
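
To make the two branches concrete, here is a small sketch that lists every match of the Python pattern (with `\n` in the capture group) and reports which side of the alternation produced it. The sample string and variable names are made up for illustration; they are not from the question.

```python
import re

# Left branch consumes whole [code]...[/code] blocks;
# right branch captures a newline in Group 1.
pattern = re.compile(r'(?s)\[code\].*?\[/code\]|(\n)')

# Hypothetical sample text, shorter than the one in the question.
sample = "a\nb [code]x\ny[/code]\nc"

for m in pattern.finditer(sample):
    if m.group(1) is not None:
        # Group 1 is set, so this is a newline outside any [code] block.
        print("newline outside [code] at index", m.start())
    else:
        # Group 1 is unset, so the left branch matched a whole block.
        print("skipped block:", repr(m.group(0)))
```

With this sample, only the newlines at indices 1 and 20 are reported; the newline inside the block is swallowed by the left branch and never reaches Group 1.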

This Python program shows how to use the regex:

import re

subject = """These concepts are represented by simple Python classes.
Edit the polls/models.py file so it looks like this:

[code]
from django.db import models

class Question(models.Model):
    question_text = models.CharField(max_length=200)
    pub_date = models.DateTimeField('date published')
[/code]"""

regex = re.compile(r'(?s)\[code\].*?\[/code\]|(\n)')

def myreplacement(m):
    # Group 1 is set only for newlines outside [code] blocks.
    if m.group(1):
        return "<br />"
    else:
        # A whole [code]...[/code] block: put it back unchanged.
        return m.group(0)

replaced = regex.sub(myreplacement, subject)
print(replaced)
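
Since the question mentions "a particular HTML tag or pseudo-tag", the same idea could be wrapped in a helper that takes the tag name as a parameter. This is only a sketch: the function name `newlines_to_br` and its parameters are hypothetical, and it assumes square-bracket pseudo-tags that are always properly closed.

```python
import re

def newlines_to_br(text, tag="code", br="<br />"):
    """Replace newlines with `br`, except inside [tag]...[/tag] blocks."""
    t = re.escape(tag)  # guard against regex metacharacters in the tag name
    pattern = re.compile(r'(?s)\[' + t + r'\].*?\[/' + t + r'\]|(\n)')
    # Group 1 is set only for newlines outside the protected blocks;
    # otherwise the whole block is returned unchanged.
    return pattern.sub(lambda m: br if m.group(1) else m.group(0), text)

print(newlines_to_br("one\ntwo [code]a\nb[/code]\nthree"))
```

Note that an unclosed [code] tag is not protected by this sketch: the left branch fails to match, so the newlines after it are replaced like any others.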
This answer was selected by the asker as the best answer.
