Removing a watermark from a PDF using iTextSharp


Question

I added a watermark to a PDF using PdfStamper. Here is the code:

for (int pageIndex = 1; pageIndex <= pageCount; pageIndex++)
{
    iTextSharp.text.Rectangle pageRectangle = reader.GetPageSizeWithRotation(pageIndex);
    PdfContentByte pdfData = stamper.GetUnderContent(pageIndex);
    pdfData.SetFontAndSize(BaseFont.CreateFont(BaseFont.HELVETICA, BaseFont.CP1252, 
        BaseFont.NOT_EMBEDDED), watermarkFontSize);
    PdfGState graphicsState = new PdfGState();
    graphicsState.FillOpacity = watermarkFontOpacity;
    pdfData.SetGState(graphicsState);
    pdfData.SetColorFill(iTextSharp.text.BaseColor.BLACK);
    pdfData.BeginText();
    pdfData.ShowTextAligned(PdfContentByte.ALIGN_CENTER, "LipikaChatterjee", 
        pageRectangle.Width / 2, pageRectangle.Height / 2, watermarkRotation);
    pdfData.EndText();
}

This works fine. Now I want to remove this watermark from my PDF. I looked through iTextSharp but could not find anything helpful. I even tried adding the watermark as a layer and then deleting the layer, but I was not able to delete the layer's content from the PDF. I looked into iText for layer removal and found a class, OCGRemover, but I could not find an equivalent class in iTextSharp.

Recommended answer

I'm going to give you the benefit of the doubt based on the statement "I even tried to add watermark as layer" and assume that you are working on content that you are creating, not trying to unwatermark someone else's content.

PDFs use Optional Content Groups (OCGs) to store objects as layers. If you add your watermark text to a layer, you can fairly easily remove it later.

The code below is a full working C# 2010 WinForms app targeting iTextSharp 5.1.1.0. It uses code based on Bruno's original Java code found here. The code is in three sections. Section 1 creates a sample PDF for us to work with. Section 2 creates a new PDF from the first and applies a watermark to each page on a separate layer. Section 3 creates a final PDF from the second but removes the layer with our watermark text. See the code comments for additional details.

When you create a PdfLayer object you can assign it a name that appears within a PDF reader. Unfortunately I can't find a way to access this name, so the code below looks for the actual watermark text within the layer. If you aren't using additional PDF layers, I would recommend looking only for /OC within the content stream and not wasting time looking for your actual watermark text. If you find a way to look for /OC groups by name, please let me know!
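The /OC check described above can be sketched without iTextSharp, as plain string handling on an already-decoded content stream. This is only an illustration, not part of the original answer: the class and method names (OcContentHelper, HasOptionalContent, StripOptionalContent) are hypothetical, and it assumes the stream decodes cleanly to text, that /OC blocks are not nested, and that the word EMC does not occur inside a text string in the block.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper (not from the original answer): finds and removes
// optional-content blocks of the form "/OC /<property> BDC ... EMC"
// in a content stream that has already been decoded to a string.
public static class OcContentHelper {
    // Assumption: blocks are not nested and "EMC" never appears as a
    // standalone word inside a text string within the block.
    private static readonly Regex OcBlock = new Regex(
        @"/OC\s*/(?<prop>[^\s/\[\]<>()]+)\s+BDC\b.*?\bEMC\b",
        RegexOptions.Singleline);

    // True if the decoded content contains at least one /OC block.
    public static bool HasOptionalContent(string content) {
        return OcBlock.IsMatch(content);
    }

    // Returns the content with every /OC block removed; everything
    // outside the blocks is left untouched.
    public static string StripOptionalContent(string content) {
        return OcBlock.Replace(content, string.Empty);
    }
}
```

Removing only the marked block, rather than emptying the whole stream, matters mainly when a stream holds other content besides the watermark. Round-tripping a stream through an ASCII string is unsafe if it contains binary data such as inline images, so treat this as a starting point only.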

using System;
using System.Windows.Forms;
using System.IO;
using iTextSharp.text;
using iTextSharp.text.pdf;

namespace WindowsFormsApplication1 {
    public partial class Form1 : Form {
        public Form1() {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e) {
            string workingFolder = Environment.GetFolderPath(Environment.SpecialFolder.Desktop);
            string startFile = Path.Combine(workingFolder, "StartFile.pdf");
            string watermarkedFile = Path.Combine(workingFolder, "Watermarked.pdf");
            string unwatermarkedFile = Path.Combine(workingFolder, "Un-watermarked.pdf");
            string watermarkText = "This is a test";

            //SECTION 1
            //Create a 5 page PDF, nothing special here
            using (FileStream fs = new FileStream(startFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
                using (Document doc = new Document(PageSize.LETTER)) {
                    using (PdfWriter writer = PdfWriter.GetInstance(doc, fs)) {
                        doc.Open();

                        for (int i = 1; i <= 5; i++) {
                            doc.NewPage();
                            doc.Add(new Paragraph(String.Format("This is page {0}", i)));
                        }

                        doc.Close();
                    }
                }
            }

            //SECTION 2
            //Create our watermark on a separate layer. The only difference here is that we are adding the watermark to a PdfLayer, which is an OCG or Optional Content Group
            PdfReader reader1 = new PdfReader(startFile);
            using (FileStream fs = new FileStream(watermarkedFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
                using (PdfStamper stamper = new PdfStamper(reader1, fs)) {
                    int pageCount1 = reader1.NumberOfPages;
                    //Create a new layer
                    PdfLayer layer = new PdfLayer("WatermarkLayer", stamper.Writer);
                    for (int i = 1; i <= pageCount1; i++) {
                        iTextSharp.text.Rectangle rect = reader1.GetPageSize(i);
                        //Get the ContentByte object
                        PdfContentByte cb = stamper.GetUnderContent(i);
                        //Tell the CB that the next commands should be "bound" to this new layer
                        cb.BeginLayer(layer);
                        cb.SetFontAndSize(BaseFont.CreateFont(BaseFont.HELVETICA, BaseFont.CP1252, BaseFont.NOT_EMBEDDED), 50);
                        PdfGState gState = new PdfGState();
                        gState.FillOpacity = 0.25f;
                        cb.SetGState(gState);
                        cb.SetColorFill(BaseColor.BLACK);
                        cb.BeginText();
                        cb.ShowTextAligned(PdfContentByte.ALIGN_CENTER, watermarkText, rect.Width / 2, rect.Height / 2, 45f);
                        cb.EndText();
                        //"Close" the layer
                        cb.EndLayer();
                    }
                }
            }

            //SECTION 3
            //Remove the layer created above
            //First we bind a reader to the watermarked file, then strip out a bunch of things, and finally use a simple stamper to write out the edited reader
            PdfReader reader2 = new PdfReader(watermarkedFile);

            //NOTE, This will destroy all layers in the document, only use if you don't have additional layers
            //Remove the OCG group completely from the document.
            //reader2.Catalog.Remove(PdfName.OCPROPERTIES);

            //Clean up the reader, optional
            reader2.RemoveUnusedObjects();

            //Placeholder variables
            PRStream stream;
            String content;
            PdfDictionary page;
            PdfArray contentarray;

            //Get the page count
            int pageCount2 = reader2.NumberOfPages;
            //Loop through each page
            for (int i = 1; i <= pageCount2; i++) {
                //Get the page
                page = reader2.GetPageN(i);
                //Get the raw content
                contentarray = page.GetAsArray(PdfName.CONTENTS);
                if (contentarray != null) {
                    //Loop through content
                    for (int j = 0; j < contentarray.Size; j++) {
                        //Get the raw byte stream
                        stream = (PRStream)contentarray.GetAsStream(j);
                        //Convert to a string. NOTE, you might need a different encoding here
                        content = System.Text.Encoding.ASCII.GetString(PdfReader.GetStreamBytes(stream));
                        //Look for the OCG token in the stream as well as our watermarked text
                        if (content.IndexOf("/OC") >= 0 && content.IndexOf(watermarkText) >= 0) {
                            //Remove it by giving it zero length and zero data
                            stream.Put(PdfName.LENGTH, new PdfNumber(0));
                            stream.SetData(new byte[0]);
                        }
                    }
                }
            }

            //Write the content out
            using (FileStream fs = new FileStream(unwatermarkedFile, FileMode.Create, FileAccess.Write, FileShare.None)) {
                using (PdfStamper stamper = new PdfStamper(reader2, fs)) {
                    //Nothing to do here, disposing the stamper writes out the edited reader
                }
            }
            this.Close();
        }
    }
}
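On the open question of finding /OC groups by name, one possible approach (a sketch I have not verified against iTextSharp 5.1.1.0) is to go through the page's resources. In a content stream the layer is referenced as /OC followed by a property name (for example /Pr1), and that property name is expected to map, through the /Properties entry of the page's /Resources dictionary, to the OCG dictionary whose /Name entry holds the layer name. The class and method names below (LayerLookup, FindPropertyNames) are hypothetical, and the sketch assumes the resources sit directly on the page rather than being inherited from a parent node.

```csharp
using System.Collections.Generic;
using iTextSharp.text.pdf;

// Hypothetical helper (not from the original answer): lists the resource
// property names on a page that point at a layer with the given name.
public static class LayerLookup {
    public static List<string> FindPropertyNames(PdfDictionary page, string layerName) {
        List<string> result = new List<string>();

        // Assumption: /Resources is set directly on the page dictionary.
        PdfDictionary resources = page.GetAsDict(PdfName.RESOURCES);
        if (resources == null) return result;

        // /Properties maps names used in the content stream to OCG dictionaries.
        PdfDictionary properties = resources.GetAsDict(PdfName.PROPERTIES);
        if (properties == null) return result;

        foreach (PdfName key in properties.Keys) {
            PdfDictionary ocg = properties.GetAsDict(key);
            if (ocg == null) continue;

            // The layer name given to the PdfLayer constructor is stored under /Name.
            PdfString name = ocg.GetAsString(PdfName.NAME);
            if (name != null && name.ToUnicodeString() == layerName) {
                result.Add(key.ToString()); // e.g. "/Pr1"
            }
        }
        return result;
    }
}
```

In Section 3 this could be called as LayerLookup.FindPropertyNames(page, "WatermarkLayer"), and the stream check could then look for "/OC " followed by one of the returned names instead of searching for the watermark text itself.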
