Commit 04d7d19

Add substitute_block method to TextRun, which takes a block with access to MatchData (regex capture groups, etc.)
I needed to do a `substitute` where the match arg was a regex; with gsub I'd use a block to get at capture groups via $1, $2, $3, etc. Because Ruby keeps $1, $2, $3 local to the frame where a block was written rather than the frame that actually calls gsub, I couldn't pass a delegated block through and offer exactly the same API as ordinary gsub. My original idea was to do that, added on to the existing #substitute. Instead, I had to provide a new, alternate #substitute_block method whose block receives a MatchData object as its arg and can read whatever it needs from there, including capture groups and the matched string.
1 parent c5bcb57 commit 04d7d19
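
The scoping issue described in the commit message can be sketched in plain Ruby, without the gem. The helper names `delegated_gsub` and `gsub_with_match_data` below are hypothetical and exist only for this illustration; the second one mirrors the approach the commit takes, under the assumption that `Regexp.last_match` is read inside a block literal written in the same method that calls `gsub`.

```ruby
# Hypothetical helper: forwards the caller's block straight to gsub.
def delegated_gsub(text, pattern, &block)
  text.gsub(pattern, &block)
end

# Hypothetical helper in the style of substitute_block: the block literal
# lives in the same method that calls gsub, so Regexp.last_match refers to
# the match gsub just made, and it is handed to the caller explicitly.
def gsub_with_match_data(text, pattern)
  text.gsub(pattern) { yield Regexp.last_match }
end

text = "total: 4"
pattern = /total: (\d+)/

# Fragile: $1 is looked up in the frame where this block was written, not in
# the frame where gsub ran, so it cannot be relied on to hold the capture.
fragile = delegated_gsub(text, pattern) { "total: #{$1.to_i * 10}" }

# Robust: the capture is read from the MatchData passed as the block argument.
result = gsub_with_match_data(text, pattern) { |m| "total: #{m[1].to_i * 10}" }
puts result
```

The trade-off is that callers write `match_data[1]` instead of `$1`, which differs from the usual `gsub` block style but does not depend on where the block was defined.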

3 files changed: +35 -1 lines changed

README.md

Lines changed: 9 additions & 1 deletion
@@ -130,6 +130,14 @@ doc.paragraphs.each do |p|
   end
 end
 
+# Substitute text with access to captures, note block arg is a MatchData, a bit
+# different than String.gsub. https://ruby-doc.org/3.3.7/MatchData.html
+doc.paragraphs.each do |p|
+  p.each_text_run do |tr|
+    tr.substitute_block(/total: (\d+)/) { |match_data| "total: #{match_data[1].to_i * 10}" }
+  end
+end
+
 # Save document to specified path
 doc.save('example-edited.docx')
 ```
@@ -145,7 +153,7 @@ doc = Docx::Document.open('tables.docx')
 # Iterate over each table
 doc.tables.each do |table|
   last_row = table.rows.last
-
+
   # Copy last row and insert a new one before last row
   new_row = last_row.copy
   new_row.insert_before(last_row)
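
Since the README note says the block argument is a MatchData rather than the matched string, here is a small standalone sketch of the MatchData accessors a `substitute_block` block could use. It runs against plain strings, not a docx document; the sample strings and the named-capture pattern are assumptions made for illustration.

```ruby
# Numbered captures plus the text around the match.
m = /total: (\d+)/.match("grand total: 4 items")
whole  = m[0]          # the full matched string
first  = m[1]          # first capture group
groups = m.captures    # all capture groups as an array
before = m.pre_match   # text before the match
after  = m.post_match  # text after the match

# Named captures are available by symbol (or string) key.
named = /(?<label>\w+): (?<amount>\d+)/.match("tax: 1")
label  = named[:label]
amount = named[:amount].to_i

# The same accessors work inside a gsub block via Regexp.last_match, which is
# the object substitute_block passes on to its caller's block.
scaled = "total: 4, tax: 1".gsub(/(?<label>\w+): (?<amount>\d+)/) do
  md = Regexp.last_match
  "#{md[:label]}: #{md[:amount].to_i * 10}"
end
puts scaled
```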

lib/docx/containers/text_run.rb

Lines changed: 13 additions & 0 deletions
@@ -57,6 +57,19 @@ def substitute(match, replacement)
           reset_text
         end
 
+        # Weird things with how $1/$2 in regex blocks are handled means we can't just delegate
+        # block to gsub to get block, we have to do it this way, with a block that gets a MatchData,
+        # from which captures and other match data can be retrieved.
+        # https://ruby-doc.org/3.3.7/MatchData.html
+        def substitute_block(match, &block)
+          @text_nodes.each do |text_node|
+            text_node.content = text_node.content.gsub(match) { |_unused_matched_string|
+              block.call(Regexp.last_match)
+            }
+          end
+          reset_text
+        end
+
         def parse_formatting
           {
             italic: !@node.xpath('.//w:i').empty?,

spec/docx/document_spec.rb

Lines changed: 13 additions & 0 deletions
@@ -206,6 +206,19 @@
 
       expect(@doc.paragraphs[1].text).to eq('Multi-line paragraph line 1same paragraph line 2yet the same paragraph line3 ')
     end
+
+    it "should replace placeholder in any line of paragraph using substitute_block" do
+      expect(@doc.paragraphs[0].text).to eq('Page title')
+      expect(@doc.paragraphs[1].text).to eq('Multi-line paragraph line 1_placeholder2_ line 2_placeholder3_ line3 ')
+
+      @doc.paragraphs[1].each_text_run do |text_run|
+        text_run.substitute_block(/_placeholder(\d)_/) { |match_data|
+          "_replacement_#{match_data[1]}"
+        }
+      end
+
+      expect(@doc.paragraphs[1].text).to eq('Multi-line paragraph line 1_replacement_2 line 2_replacement_3 line3 ')
+    end
   end
 
   describe 'read formatting' do
