I've been manually converting articles into Markdown syntax for a few days now, and it's getting rather tedious. Some of them run 3 or 4 pages, with italics and other emphasized text throughout. Is there a faster way to convert (.rtf|.doc) files to clean Markdown syntax?
- donglu9743 2011-05-09 20:53
ProgTips has a possible solution with a Word macro (source download):
A simple macro (source download) for converting the most trivial things automatically. This macro does:
- Replace bold and italics
- Replace headings (marked heading 1-6)
- Replace numbered and bulleted lists
It's very buggy; I believe it hangs on larger documents. However, I'm NOT stating it's a stable release anyway! :-) Experimental use only: recode and reuse it as you like, and post a comment if you've found a better solution.
Source: ProgTips
Macro source
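For a sense of what the replacements listed above produce, here is a minimal VBA sketch. The helper names (`MdBold`, `MdItalic`, `MdHeading`, `DemoMarkdownRules`) are hypothetical and not part of the ProgTips macro; they work on plain strings only, so they can be pasted into any module and run without touching a document.

```vba
' Hypothetical helpers, not part of the ProgTips macro: each one returns
' the Markdown text that the corresponding rule is meant to produce.
Function MdBold(ByVal s As String) As String
    MdBold = "**" & s & "**"
End Function

Function MdItalic(ByVal s As String) As String
    MdItalic = "_" & s & "_"
End Function

Function MdHeading(ByVal level As Integer, ByVal s As String) As String
    ' String$(n, "#") repeats "#" n times, so level 2 gives "##"
    MdHeading = String$(level, "#") & " " & s
End Function

Sub DemoMarkdownRules()
    Debug.Print MdBold("bold")            ' **bold**
    Debug.Print MdItalic("italic")        ' _italic_
    Debug.Print MdHeading(2, "Title")     ' ## Title
    Debug.Print "* " & "bulleted item"    ' * bulleted item
    Debug.Print "1. " & "numbered item"   ' 1. numbered item
End Sub
```

Running `DemoMarkdownRules` prints the results to the Immediate window (Ctrl+G in the VBA editor).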
Installation
- Open WinWord.
- Press Alt+F11 to open the VBA editor.
- Right-click the first project in the project browser.
- Choose Insert > Module.
- Paste the code from the file.
- Close the macro editor.
- Go to Tools > Macro > Macros and run the macro named MarkDown.
Macro source for safe keeping if ProgTips deletes the post or the site gets wiped out:
    '*** A simple MsWord->Markdown replacement macro by Kriss Rauhvargers, 2006.02.02.
    '*** This tool does NOT implement all the markup specified in MarkDown definition by John Gruber, only
    '*** the most simple things. These are:
    '*** 1) Replaces all non-list paragraphs to ^p paragraph so MarkDown knows it is a stand-alone paragraph
    '*** 2) Converts tables to text. In fact, tables get lost.
    '*** 3) Adds a single indent to all indented paragraphs
    '*** 4) Replaces all the text in italics to _text_
    '*** 5) Replaces all the text in bold to **text**
    '*** 6) Replaces Heading1-6 to #..#Heading (Heading numbering gets lost)
    '*** 7) Replaces bulleted lists with ^p * listitem ^p* listitem2...
    '*** 8) Replaces numbered lists with ^p 1. listitem ^p2. listitem2...
    '*** Feel free to use and redistribute this code

    Sub MarkDown()
        Dim bReplace As Boolean
        Dim i As Integer
        Dim oPara As Paragraph

        'remove formatting from paragraph sign so that we don't get **blablabla^p** but rather **blablabla**^p
        Call RemoveBoldEnters

        For i = Selection.Document.Tables.Count To 1 Step -1
            Call Selection.Document.Tables(i).ConvertToText
        Next

        'simple text indent + extra paragraphs for non-numbered paragraphs
        For i = Selection.Document.Paragraphs.Count To 1 Step -1
            Set oPara = Selection.Document.Paragraphs(i)
            If oPara.Range.ListFormat.ListType = wdListNoNumbering Then
                If oPara.LeftIndent > 0 Then
                    oPara.Range.InsertBefore (">")
                End If
                oPara.Range.InsertBefore (vbCrLf)
            End If
        Next

        'italic -> _italic_
        Selection.HomeKey Unit:=wdStory
        bReplace = ReplaceOneItalic 'first replacement
        While bReplace 'other replacements
            bReplace = ReplaceOneItalic
        Wend

        'bold -> **bold**
        Selection.HomeKey Unit:=wdStory
        bReplace = ReplaceOneBold 'first replacement
        While bReplace
            bReplace = ReplaceOneBold 'other replacements
        Wend

        'Heading -> ##heading
        For i = 1 To 6 'heading1 to heading6
            Selection.HomeKey Unit:=wdStory
            bReplace = ReplaceH(i) 'first replacement
            While bReplace
                bReplace = ReplaceH(i) 'other replacements
            Wend
        Next

        Call ReplaceLists

        Selection.HomeKey Unit:=wdStory
    End Sub

    '***************************************************************
    ' Function to replace bold with **bold**, only the first occurrence
    ' Returns true if any occurrence found, false otherwise
    ' Originally recorded by WinWord macro recorder, probably contains
    ' quite a lot of useless code
    '***************************************************************
    Function ReplaceOneBold() As Boolean
        Dim bReturn As Boolean

        Selection.Find.ClearFormatting
        With Selection.Find
            .Text = ""
            .Forward = True
            .Wrap = wdFindContinue
            .Font.Bold = True
            .Format = True
            .MatchCase = False
            .MatchWholeWord = False
            .MatchWildcards = False
            .MatchSoundsLike = False
            .MatchAllWordForms = False
        End With

        bReturn = False
        While Selection.Find.Execute = True
            bReturn = True
            Selection.Text = "**" & Selection.Text & "**"
            Selection.Font.Bold = False
            Selection.Find.Execute
        Wend

        ReplaceOneBold = bReturn
    End Function

    '*******************************************************************
    ' Function to replace italic with _italic_, only the first occurrence
    ' Returns true if any occurrence found, false otherwise
    ' Originally recorded by WinWord macro recorder, probably contains
    ' quite a lot of useless code
    '********************************************************************
    Function ReplaceOneItalic() As Boolean
        Dim bReturn As Boolean

        Selection.Find.ClearFormatting
        With Selection.Find
            .Text = ""
            .Forward = True
            .Wrap = wdFindContinue
            .Font.Italic = True
            .Format = True
            .MatchCase = False
            .MatchWholeWord = False
            .MatchWildcards = False
            .MatchSoundsLike = False
            .MatchAllWordForms = False
        End With

        bReturn = False
        While Selection.Find.Execute = True
            bReturn = True
            Selection.Text = "_" & Selection.Text & "_"
            Selection.Font.Italic = False
            Selection.Find.Execute
        Wend

        ReplaceOneItalic = bReturn
    End Function

    '*********************************************************************
    ' Function to replace headingX with #heading, only the first occurrence
    ' Returns true if any occurrence found, false otherwise
    ' Originally recorded by WinWord macro recorder, probably contains
    ' quite a lot of useless code
    '*********************************************************************
    Function ReplaceH(ByVal ipNumber As Integer) As Boolean
        Dim sReplacement As String
        Dim bReturn As Boolean

        Select Case ipNumber
            Case 1: sReplacement = "#"
            Case 2: sReplacement = "##"
            Case 3: sReplacement = "###"
            Case 4: sReplacement = "####"
            Case 5: sReplacement = "#####"
            Case 6: sReplacement = "######"
        End Select

        Selection.Find.ClearFormatting
        Selection.Find.Style = ActiveDocument.Styles("Heading " & ipNumber)
        With Selection.Find
            .Text = ""
            .Replacement.Text = ""
            .Forward = True
            .Wrap = wdFindContinue
            .Format = True
            .MatchCase = False
            .MatchWholeWord = False
            .MatchWildcards = False
            .MatchSoundsLike = False
            .MatchAllWordForms = False
        End With

        bReturn = False
        While Selection.Find.Execute = True
            bReturn = True
            Selection.Range.InsertBefore (vbCrLf & sReplacement & " ")
            Selection.Style = ActiveDocument.Styles("Normal")
            Selection.Find.Execute
        Wend

        ReplaceH = bReturn
    End Function

    '***************************************************************
    ' A fix-up for paragraph marks that are bold or italic
    '***************************************************************
    Sub RemoveBoldEnters()
        Selection.HomeKey Unit:=wdStory
        Selection.Find.ClearFormatting
        Selection.Find.Font.Italic = True
        Selection.Find.Replacement.ClearFormatting
        Selection.Find.Replacement.Font.Bold = False
        Selection.Find.Replacement.Font.Italic = False
        With Selection.Find
            .Text = "^p"
            .Replacement.Text = "^p"
            .Forward = True
            .Wrap = wdFindContinue
            .Format = True
        End With
        Selection.Find.Execute Replace:=wdReplaceAll

        Selection.HomeKey Unit:=wdStory
        Selection.Find.ClearFormatting
        Selection.Find.Font.Bold = True
        Selection.Find.Replacement.ClearFormatting
        Selection.Find.Replacement.Font.Bold = False
        Selection.Find.Replacement.Font.Italic = False
        With Selection.Find
            .Text = "^p"
            .Replacement.Text = "^p"
            .Forward = True
            .Wrap = wdFindContinue
            .Format = True
        End With
        Selection.Find.Execute Replace:=wdReplaceAll
    End Sub

    '***************************************************************
    ' Sub to convert bulleted and numbered lists to MarkDown list items
    '***************************************************************
    Sub ReplaceLists()
        Dim i As Integer
        Dim j As Integer
        Dim Para As Paragraph

        Selection.HomeKey Unit:=wdStory

        'iterate through all the lists in the document
        For i = Selection.Document.Lists.Count To 1 Step -1
            'check each paragraph in the list
            For j = Selection.Document.Lists(i).ListParagraphs.Count To 1 Step -1
                Set Para = Selection.Document.Lists(i).ListParagraphs(j)
                'if it's a bulleted list
                If Para.Range.ListFormat.ListType = wdListBullet Then
                    Para.Range.InsertBefore (ListIndent(Para.Range.ListFormat.ListLevelNumber, "*"))
                'if it's a numbered list (each list type must be compared explicitly)
                ElseIf Para.Range.ListFormat.ListType = wdListSimpleNumbering Or _
                       Para.Range.ListFormat.ListType = wdListMixedNumbering Or _
                       Para.Range.ListFormat.ListType = wdListListNumOnly Then
                    Para.Range.InsertBefore (Para.Range.ListFormat.ListValue & ". ")
                End If
            Next j
            'inserts paragraph marks before and after, removes the list itself
            Selection.Document.Lists(i).Range.InsertParagraphBefore
            Selection.Document.Lists(i).Range.InsertParagraphAfter
            Selection.Document.Lists(i).RemoveNumbers
        Next i
    End Sub

    '***********************************************************
    ' Returns the MarkDown indent text
    '***********************************************************
    Function ListIndent(ByVal ipNumber As Integer, ByVal spChar As String) As String
        Dim i As Integer
        For i = 1 To ipNumber - 1
            ListIndent = ListIndent & " "
        Next
        ListIndent = ListIndent & spChar & " "
    End Function