<?xml version="1.0" encoding="UTF-8"?><!-- generator="wordpress/2.2.1" -->
<rss version="2.0" 
	xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
	<title>Comments on: “This Document Contains Renderable Text” Acrobat 8</title>
	<link>http://acrobatsupport.com/document-contains-renderable-text/</link>
	<description>Tips, Tricks, Tutorials and Support for Adobe Acrobat</description>
	<pubDate>Tue, 06 Jan 2009 04:48:55 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.2.1</generator>

	<item>
		<title>By: anon</title>
		<link>http://acrobatsupport.com/document-contains-renderable-text/#comment-5369</link>
		<author>anon</author>
		<pubDate>Sat, 27 Dec 2008 23:03:35 +0000</pubDate>
		<guid>http://acrobatsupport.com/document-contains-renderable-text/#comment-5369</guid>
		<description>Just an FYI, Nitro PDF has an option to insert Bates numbers, but obviously you guys need the opposite ability.

If any software developers want to make some money, here's a great idea: a simple utility that removes them and does nothing else. If 1,000 people paid $50 for such a utility (and believe me, software customized for the legal industry is expen$$$ive), that would earn you $50,000.

I might just have to program this myself. So what's the best programming language for this task?</description>
		<content:encoded><![CDATA[<p>Just an FYI, Nitro PDF has an option to insert Bates numbers, but obviously you guys need the opposite ability.</p>
<p>If any software developers want to make some money, here&#8217;s a great idea: a simple utility that removes them and does nothing else. If 1,000 people paid $50 for such a utility (and believe me, software customized for the legal industry is expen$$$ive), that would earn you $50,000.</p>
<p>I might just have to program this myself. So what&#8217;s the best programming language for this task?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Xochi</title>
		<link>http://acrobatsupport.com/document-contains-renderable-text/#comment-4817</link>
		<author>Xochi</author>
		<pubDate>Tue, 16 Dec 2008 02:30:50 +0000</pubDate>
		<guid>http://acrobatsupport.com/document-contains-renderable-text/#comment-4817</guid>
		<description>I've worked with a lot of legal exhibits that have gone to court and come out of court with what some refer to as Court Branding.  At the top of every page is blue text that identifies the document and page number.  It is that bit of text that interferes with the OCR process.  We are talking about thousands of documents (really) that need to be searchable.  The simple solution is to delete the text on each and every page, sometimes 300 or 400 pages.  There has got to be a better way.  Tonight, I came across a similar problem with Bates numbers digitally stamped at the bottom of the page.  I could not delete that.  Cropping 600 pages (tonight's document) is time-consuming, and believe me, this was a conglomeration of many different documents, different sizes, portrait and landscape.  Cropping would have cut off text that needed to be searched.  I guarantee you this is only the beginning; there will be many more of these types of situations.  I don't know how these digital stamps are generated, and I would like to know an easier way than those mentioned previously to get rid of them.</description>
		<content:encoded><![CDATA[<p>I&#8217;ve worked with a lot of legal exhibits that have gone to court and come out of court with what some refer to as Court Branding.  At the top of every page is blue text that identifies the document and page number.  It is that bit of text that interferes with the OCR process.  We are talking about thousands of documents (really) that need to be searchable.  The simple solution is to delete the text on each and every page, sometimes 300 or 400 pages.  There has got to be a better way.  Tonight, I came across a similar problem with Bates numbers digitally stamped at the bottom of the page.  I could not delete that.  Cropping 600 pages (tonight&#8217;s document) is time-consuming, and believe me, this was a conglomeration of many different documents, different sizes, portrait and landscape.  Cropping would have cut off text that needed to be searched.  I guarantee you this is only the beginning; there will be many more of these types of situations.  I don&#8217;t know how these digital stamps are generated, and I would like to know an easier way than those mentioned previously to get rid of them.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Bruce Anderson</title>
		<link>http://acrobatsupport.com/document-contains-renderable-text/#comment-4704</link>
		<author>Bruce Anderson</author>
		<pubDate>Tue, 18 Nov 2008 20:23:59 +0000</pubDate>
		<guid>http://acrobatsupport.com/document-contains-renderable-text/#comment-4704</guid>
		<description>Crop the page to remove the renderable text; Acrobat can then run the OCR.</description>
		<content:encoded><![CDATA[<p>Crop the page to remove the renderable text; Acrobat can then run the OCR.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mitch</title>
		<link>http://acrobatsupport.com/document-contains-renderable-text/#comment-356</link>
		<author>Mitch</author>
		<pubDate>Sat, 23 Feb 2008 06:34:53 +0000</pubDate>
		<guid>http://acrobatsupport.com/document-contains-renderable-text/#comment-356</guid>
		<description>Although the blog author works for Adobe, his posts aren't really official recommendations.
-
The 8.1.1 update addresses OCR, but only with regard to Asian language fonts:
http://www.adobe.com/support/downloads/detail.jsp?ftpID=3796
-
There's also an 8.1.2 patch out there. You may want to apply that. No guarantees on OCR improvement.
http://www.adobe.com/support/downloads/detail.jsp?ftpID=3849
-
Best of luck,
Mitch</description>
		<content:encoded><![CDATA[<p>Although the blog author works for Adobe, his posts aren&#8217;t really official recommendations.<br />
-<br />
The 8.1.1 update addresses OCR, but only with regard to Asian language fonts:<br />
<a href="http://www.adobe.com/support/downloads/detail.jsp?ftpID=3796" rel="nofollow">http://www.adobe.com/support/downloads/detail.jsp?ftpID=3796</a><br />
-<br />
There&#8217;s also an 8.1.2 patch out there. You may want to apply that. No guarantees on OCR improvement.<br />
<a href="http://www.adobe.com/support/downloads/detail.jsp?ftpID=3849" rel="nofollow">http://www.adobe.com/support/downloads/detail.jsp?ftpID=3849</a><br />
-<br />
Best of luck,<br />
Mitch</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ederosia</title>
		<link>http://acrobatsupport.com/document-contains-renderable-text/#comment-354</link>
		<author>ederosia</author>
		<pubDate>Sat, 23 Feb 2008 02:30:59 +0000</pubDate>
		<guid>http://acrobatsupport.com/document-contains-renderable-text/#comment-354</guid>
		<description>&lt;p&gt;My mistake.  I thought you were affiliated with Adobe because of the URL and your use of the Acrobat logo.  &lt;/p&gt;
&lt;p&gt;Can you please tell me what you think of this blog entry, written by an Adobe employee?  It's at http://blogs.adobe.com/acrolaw/2007/06/acrobat_81_update_fix_for_render.html  The author describes a fix made by Adobe to this whole problem.  However, I've tried the steps the writer recommends, and it didn't solve the problem for me.  Furthermore, I've read the Adobe Knowledge Base Article to which the author refers, and it doesn't even refer to the fix he described.  But, as I say, he seems at least semi-affiliated with Adobe.  Can you comment on whether Adobe really has fixed this problem?&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>My mistake.  I thought you were affiliated with Adobe because of the URL and your use of the Acrobat logo.  </p>
<p>Can you please tell me what you think of this blog entry, written by an Adobe employee?  It&#8217;s at <a href="http://blogs.adobe.com/acrolaw/2007/06/acrobat_81_update_fix_for_render.html" rel="nofollow">http://blogs.adobe.com/acrolaw/2007/06/acrobat_81_update_fix_for_render.html</a>  The author describes a fix made by Adobe to this whole problem.  However, I&#8217;ve tried the steps the writer recommends, and it didn&#8217;t solve the problem for me.  Furthermore, I&#8217;ve read the Adobe Knowledge Base Article to which the author refers, and it doesn&#8217;t even refer to the fix he described.  But, as I say, he seems at least semi-affiliated with Adobe.  Can you comment on whether Adobe really has fixed this problem?</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mitch</title>
		<link>http://acrobatsupport.com/document-contains-renderable-text/#comment-353</link>
		<author>Mitch</author>
		<pubDate>Sat, 23 Feb 2008 00:16:21 +0000</pubDate>
		<guid>http://acrobatsupport.com/document-contains-renderable-text/#comment-353</guid>
		<description>&lt;p&gt;ederosia,&lt;br /&gt;
We agree with you - Acrobat's OCR function is far from perfect. Currently our best workaround is the PDF to TIFF to PDF option.&lt;/p&gt;
&lt;p&gt;Please note that we are not affiliated with Adobe; we merely offer our suggestions to the Acrobat community as a free service.&lt;br /&gt;
Mitch&lt;/p&gt;
</description>
		<content:encoded><![CDATA[<p>ederosia,<br />
We agree with you - Acrobat&#8217;s OCR function is far from perfect. Currently our best workaround is the PDF to TIFF to PDF option.</p>
<p>Please note that we are not affiliated with Adobe; we merely offer our suggestions to the Acrobat community as a free service.<br />
Mitch</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ederosia</title>
		<link>http://acrobatsupport.com/document-contains-renderable-text/#comment-352</link>
		<author>ederosia</author>
		<pubDate>Fri, 22 Feb 2008 23:10:46 +0000</pubDate>
		<guid>http://acrobatsupport.com/document-contains-renderable-text/#comment-352</guid>
		<description>Or, if ignoring the renderable text is somehow difficult for Adobe to do, how about a function within Acrobat that converts the entire document into bitmapped form?  In essence, it would "flatten" the entire document (renderable text and all) into a bitmapped form.  It would accomplish *within* Acrobat what the silly TIFF-export/import workaround accomplishes.  For the user, this single extra step wouldn't be a big deal.  

Let me add that Adobe has often described this issue in support forums as if it were a user problem.  In essence, they have said, "The stupid user is trying to OCR a document that doesn't need it!"  (e.g., see http://acrobatsupport.com/document-contains-renderable-text)  But please understand that we really do get it.  The document really does need to be OCRed.  It's just that the OCR is prevented by a little bit of rendered text that someone has added somewhere to the document (e.g., a little notice at the bottom of the page).  Don't write us off as idiots.  This error does not ONLY come up when someone is trying to OCR a document that has already been OCRed.</description>
		<content:encoded><![CDATA[<p>Or, if ignoring the renderable text is somehow difficult for Adobe to do, how about a function within Acrobat that converts the entire document into bitmapped form?  In essence, it would &#8220;flatten&#8221; the entire document (renderable text and all) into a bitmapped form.  It would accomplish *within* Acrobat what the silly TIFF-export/import workaround accomplishes.  For the user, this single extra step wouldn&#8217;t be a big deal.  </p>
<p>Let me add that Adobe has often described this issue in support forums as if it were a user problem.  In essence, they have said, &#8220;The stupid user is trying to OCR a document that doesn&#8217;t need it!&#8221;  (e.g., see <a href="http://acrobatsupport.com/document-contains-renderable-text" rel="nofollow">http://acrobatsupport.com/document-contains-renderable-text</a>)  But please understand that we really do get it.  The document really does need to be OCRed.  It&#8217;s just that the OCR is prevented by a little bit of rendered text that someone has added somewhere to the document (e.g., a little notice at the bottom of the page).  Don&#8217;t write us off as idiots.  This error does not ONLY come up when someone is trying to OCR a document that has already been OCRed.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: ederosia</title>
		<link>http://acrobatsupport.com/document-contains-renderable-text/#comment-351</link>
		<author>ederosia</author>
		<pubDate>Fri, 22 Feb 2008 22:01:43 +0000</pubDate>
		<guid>http://acrobatsupport.com/document-contains-renderable-text/#comment-351</guid>
		<description>One little bit of renderable text at the bottom of each page makes it impossible to OCR the thing!  Frankly, the TIFF workaround is terrible.  It's difficult for me to think of a more tedious solution.  Why can't Acrobat simply IGNORE the renderable text?!?  For the kind of money we paid for this program, I expect better solutions than "convert the entire document into TIFF and then import it back into Acrobat!"  Honestly, this has been a problem for years and years.  Please fix this problem.</description>
		<content:encoded><![CDATA[<p>One little bit of renderable text at the bottom of each page makes it impossible to OCR the thing!  Frankly, the TIFF workaround is terrible.  It&#8217;s difficult for me to think of a more tedious solution.  Why can&#8217;t Acrobat simply IGNORE the renderable text?!?  For the kind of money we paid for this program, I expect better solutions than &#8220;convert the entire document into TIFF and then import it back into Acrobat!&#8221;  Honestly, this has been a problem for years and years.  Please fix this problem.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: chriscoyier</title>
		<link>http://acrobatsupport.com/document-contains-renderable-text/#comment-13</link>
		<author>chriscoyier</author>
		<pubDate>Tue, 18 Sep 2007 02:59:38 +0000</pubDate>
		<guid>http://acrobatsupport.com/document-contains-renderable-text/#comment-13</guid>
		<description>@D. Peterson: I can see how this could be a bit frustrating, but it would be unnecessary to "disassemble the entire file and reassemble" it. You could extract the single page, use the above steps to save as a TIF and run OCR on that, and then delete the existing page and insert the new one. Just as easy on an 800 page PDF as a 10 page PDF.</description>
		<content:encoded><![CDATA[<p>@D. Peterson: I can see how this could be a bit frustrating, but it would be unnecessary to &#8220;disassemble the entire file and reassemble&#8221; it. You could extract the single page, use the above steps to save as a TIF and run OCR on that, and then delete the existing page and insert the new one. Just as easy on an 800 page PDF as a 10 page PDF.</p>
]]></content:encoded>
	</item>
	<item>
		<title>By: Mitch</title>
		<link>http://acrobatsupport.com/document-contains-renderable-text/#comment-12</link>
		<author>Mitch</author>
		<pubDate>Tue, 18 Sep 2007 00:17:05 +0000</pubDate>
		<guid>http://acrobatsupport.com/document-contains-renderable-text/#comment-12</guid>
		<description>In your case, I would look at how the document was originally scanned in. Open the scanned PDF file, go to File &#62; Properties, and look for the PDF Producer. My guess is that it will say something other than Adobe Acrobat, probably the name of your scanner. If so, the PDF was produced by third-party software and may not work correctly with Acrobat. Adobe has two recommended methods for scanning.
1. Go to File &#62; Create PDF &#62; From Scanner &#62; choose your scanner and click Scan or
2. Scan to an image format with your scanner software and then convert to PDF.

Neither of the above methods will produce the "Renderable Text" error.

Hope this helps,
Mitch 


Please note that we are not affiliated with Adobe Systems.</description>
		<content:encoded><![CDATA[<p>In your case, I would look at how the document was originally scanned in. Open the scanned PDF file, go to File &gt; Properties, and look for the PDF Producer. My guess is that it will say something other than Adobe Acrobat, probably the name of your scanner. If so, the PDF was produced by third-party software and may not work correctly with Acrobat. Adobe has two recommended methods for scanning.<br />
1. Go to File &gt; Create PDF &gt; From Scanner &gt; choose your scanner and click Scan or<br />
2. Scan to an image format with your scanner software and then convert to PDF.</p>
<p>Neither of the above methods will produce the &#8220;Renderable Text&#8221; error.</p>
<p>Hope this helps,<br />
Mitch </p>
<p>Please note that we are not affiliated with Adobe Systems.</p>
]]></content:encoded>
	</item>
</channel>
</rss>
